SPECTRAL GAPS IN WASSERSTEIN DISTANCES AND THE 2D 
STOCHASTIC NAVIER-STOKES EQUATIONS 

By Martin Hairer 1 and Jonathan C. Mattingly 2 

University of Warwick and Duke University 

We develop a general method to prove the existence of spectral 
gaps for Markov semigroups on Banach spaces. Unlike most previ- 
ous work, the type of norm we consider for this analysis is neither 
a weighted supremum norm nor an $L^p$-type norm, but involves the 
derivative of the observable as well and hence can be seen as a type of 
1-Wasserstein distance. This turns out to be a suitable approach for 
infinite-dimensional spaces where the usual Harris or Doeblin condi- 
tions, which are geared toward total variation convergence, often fail 
to hold. In the first part of this paper, we consider semigroups that 
have uniform behavior which one can view as the analog of Doeblin's 
condition. We then proceed to study situations where the behavior 
is not so uniform, but the system has a suitable Lyapunov struc- 
ture, leading to a type of Harris condition. We finally show that the 
latter condition is satisfied by the two-dimensional stochastic Navier- 
Stokes equations, even in situations where the forcing is extremely 
degenerate. Using the convergence result, we show that the stochastic 
Navier-Stokes equations' invariant measures depend continuously on 
the viscosity and the structure of the forcing. 

1. Introduction. This work is motivated by the study of the two-dimen- 
sional stochastic Navier-Stokes equations on the torus. However, the results 
and techniques are more general. The main abstract result of the paper gives 
a criterion guaranteeing that a Markov semigroup on a Banach space has a 
spectral gap in a particular 1-Wasserstein distance. (In the sequel, we will 
simply write Wasserstein for 1-Wasserstein.) To the best of our knowledge, 



Received May 2006; revised May 2007. 

Supported in part by EPSRC Advanced Research Fellowship Grant EP/D071593/1. 
Supported in part by NSF PECASE Award DMS-04-49910 and an Alfred P. Sloan 
foundation fellowship. 

AMS 2000 subject classifications. 37A30, 37A25, 60H15. 

Key words and phrases. Stochastic PDEs, Wasserstein distance, ergodicity, mixing, 
spectral gap. 

This is an electronic reprint of the original article published by the 
Institute of Mathematical Statistics in The Annals of Probability. 
2008, Vol. 36, No. 6, 2050-2091. This reprint differs from the original in 
pagination and typographic detail. 






these results are the first results providing a spectral gap in this, or any sim- 
ilar, setting. In turn, the existence of a spectral gap implies that the Markov 
semigroup possesses a unique, exponentially mixing invariant measure. 

The results of this article rely on the existence of a "gradient estimate" in- 
troduced in [21] in the study of the degenerately forced Navier-Stokes equa- 
tions on the two-dimensional torus. This estimate was used there in order 
to show that the corresponding Markov semigroup satisfies the "asymptotic 
strong Feller" property, also introduced in [21]. In this work, we show that 
gradient estimates of this form not only are useful to show uniqueness of the 
invariant measure, but are an essential ingredient to obtain the existence 
of a spectral gap for a large class of systems. In this introductory section, 
we concentrate on the two-dimensional stochastic Navier-Stokes equations 
on a torus to show how the main results can be applied. At the end of this 
section, we give an overview of the paper. 

Recall that the Navier-Stokes equations describing the evolution of the 
velocity field $v(x,t)$ (with $x \in \mathbb{T}^2$) of a fluid under the 
influence of a body force $\bar F(x) + F(x,t)$ are given by 

(SNS)  $\partial_t v = \nu \Delta v - (v \cdot \nabla) v - \nabla p + \bar F + F, \qquad \operatorname{div} v = 0,$ 

where the pressure $p(x,t)$ is determined by the algebraic condition 
$\operatorname{div} v = 0$. We consider for $F$ a Gaussian stochastic forcing, 
that is, centered, white in time, colored in space and such that 
$\int F(x)\,dx = \int \bar F(x)\,dx = 0$. Since the gradient part of the 
forcing is canceled by the pressure term, we assume without loss of 
generality that $\operatorname{div} \bar F = \operatorname{div} F = 0$. More 
precisely, we assume that for $i,j \in \{1,2\}$ 

$\mathbf{E} F_i(x,t) F_j(x',t') = \delta(t-t')\, Q_{ij}(x-x'), \qquad \sum_{i,j=1}^{2} \partial_{ij} Q_{ij} = 0, \qquad \int Q_{ij}(x)\,dx = 0.$ 

Although we are confident that our results are valid for $Q$ sufficiently 
smooth, we restrict ourselves to the case where $Q$ is a trigonometric 
polynomial, so that we can make use of the bounds obtained in [21, 39]. 

Instead of considering the velocity equation (SNS) directly, we will consider 
the equation for the vorticity 
$w = \nabla \wedge v = \partial_1 v_2 - \partial_2 v_1$. Note that $v$ is 
uniquely determined from $w$ (we will write $v = \mathcal{K}w$) through the 
conditions 

$w = \nabla \wedge v, \qquad \operatorname{div} v = 0, \qquad \int v(x)\,dx = 0.$ 

When written in terms of $w$, (SNS) is equivalent to 

(1)  $\partial_t w = \nu \Delta w - (\mathcal{K}w) \cdot \nabla w + \bar f + f,$ 
where we have defined $\bar f = \nabla \wedge \bar F$ and 
$f = \nabla \wedge F$. Note that since $f$ is translation invariant, one can 
write it as 

$f(x,t) = \operatorname{Re} \sum_{k \in \mathbb{Z}^2 \setminus \{0\}} q_k e^{ikx} \xi_k(t),$ 

where the $\xi_k$ are independent white noises and where $q_k = q_{-k}$. We 
can therefore identify the correlation function $Q$ with a vector $q$ in the 
set of square integrable sequences with positive entries. Denoting by 
$\mathcal{Z}$ the set of indices $k$ for which $q_k \neq 0$, we will make 
throughout this article the following assumptions: 

Assumption 1. Only finitely many of the $q_k$'s are nonzero and $\bar f$ 
lies in the span of $\{e^{ikx} \mid q_k \neq 0\}$. Furthermore, $\mathcal{Z}$ 
generates $\mathbb{Z}^2$ and there exist $k, \ell \in \mathcal{Z}$ with 
$|k| \neq |\ell|$. 
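
The two algebraic requirements in Assumption 1 are easy to test for a given 
finite index set. The following is a minimal sketch, not taken from the 
paper: the function names and the three example index sets are hypothetical, 
and the test for "generates $\mathbb{Z}^2$" uses the standard lattice fact 
that integer vectors in the plane generate $\mathbb{Z}^2$ exactly when the 
greatest common divisor of all their $2 \times 2$ determinants is 1. 

```python
from itertools import combinations
from math import gcd


def generates_Z2(modes):
    """Return True if integer combinations of `modes` fill all of Z^2.

    For integer vectors in the plane this holds exactly when the gcd of
    all 2x2 determinants formed from pairs of vectors equals 1.
    """
    g = 0
    for (a, b), (c, d) in combinations(modes, 2):
        g = gcd(g, abs(a * d - b * c))
    return g == 1


def has_two_lengths(modes):
    """Return True if at least two vectors have different Euclidean length."""
    return len({a * a + b * b for a, b in modes}) >= 2


def satisfies_mode_condition(modes):
    """Check both algebraic requirements placed on the index set."""
    return generates_Z2(modes) and has_two_lengths(modes)


# Hypothetical index sets, symmetric under k -> -k (assumed for illustration).
example_ok = [(1, 0), (-1, 0), (1, 1), (-1, -1)]
example_same_length = [(1, 0), (-1, 0), (0, 1), (0, -1)]
example_sublattice = [(2, 0), (-2, 0), (1, 1), (-1, -1)]

for name, modes in [("ok", example_ok),
                    ("same length", example_same_length),
                    ("sublattice", example_sublattice)]:
    print(name, satisfies_mode_condition(modes))
```

In this sketch the first set passes both checks, the second fails because 
all of its vectors have the same length, and the third fails because it 
only generates a sublattice of index 2. 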

Remark 1.1. The assumption that only a finite number of $q_k$ are nonzero 
is only a technical assumption reflecting a deficiency in [39]. All of the 
results of this article certainly hold if the first part of Assumption 1 is 
replaced by an appropriate decay property for the $q_k$. Note, for example, 
that in [21], Section 4.5, it is shown that there exists an $N_*$ such that 
if the range of $Q$ contains $\{e^{ikx} \mid |k| \le N_*\}$, $\bar f$ is as 
in Assumption 1, and $\sum q_k < \infty$, then all of the results of this 
paper hold. In particular, this allows infinitely many $q_k$ to be nonzero. 

Remark 1.2. Using the results in [3] one can remove the restriction that 
the forcing need consist of Fourier modes and replace it with the requirement 
that the forced functions span the Fourier modes required above. Since this 
is not the main point of this article, we do not elaborate further here. 

Remark 1.3. It is clear that the assumption that 
$\bar f \in \operatorname{span}\{e^{ikx} \mid q_k \neq 0\}$ is far from 
optimal. The correct result likely places no restriction on $\bar f$ other 
than that it be sufficiently smooth. This more delicate result requires an 
improved understanding of the control problem obtained by replacing the 
noise by controls. Some steps in this direction have been made [1, 2, 47], 
but the current results are not sufficient for our needs. Nonetheless, the 
present assumption on $\bar f$ seems reasonable from a modeling perspective 
where one would likely have some noise in all of the directions on which the 
body forces act. 

We will consider (SNS) as an evolution equation in the subspace 
$\mathcal{H}$ of $H^1$ that consists of velocity fields $v$ with 
$\operatorname{div} v = 0$ in the sense of distributions. Note that this is 
equivalent to $w \in L^2$. We make a slight abuse of notation and denote by 
$\mathcal{P}_t$ the transition probabilities for (SNS), as well as the 
corresponding Markov semigroup on $\mathcal{H}$, that is, 

$\mathcal{P}_t(v, A) = \mathbf{P}(v(t,\cdot) \in A \mid v(0,\cdot) = v),$ 

for every Borel set $A \subset \mathcal{H}$, and 

$(\mathcal{P}_t \phi)(v) = \int \phi(v')\,\mathcal{P}_t(v, dv'), \qquad (\mathcal{P}_t^* \mu)(A) = \int \mathcal{P}_t(v, A)\,\mu(dv)$ 

for every $\phi : \mathcal{H} \to \mathbb{R}$ and probability measure $\mu$ 
on $\mathcal{H}$. Analogously we define the projection 
$\mu\phi = \int \phi(x)\,\mu(dx)$. It was shown in [21] that Assumption 1 
implies that (SNS) admits a unique invariant measure $\mu_*$, that is, 
$\mu_*$ is a probability measure on $\mathcal{H}$ such that 
$\mathcal{P}_t^* \mu_* = \mu_*$ for every $t \ge 0$. 

This article is concerned with whether, for an arbitrary probability measure 
$\mu$, $\mathcal{P}_t^* \mu \to \mu_*$ (as $t \to \infty$) and in what sense 
this convergence takes place. Note that (SNS) is not expected to have the 
strong Feller property, so that it is a fortiori not expected that 
$\mathcal{P}_t^* \mu \to \mu_*$ in the total variation topology if $\mu$ and 
$\mu_*$ are mutually singular. (See [9] for a general discussion of the 
strong Feller property in infinite dimensions and [21] for a discussion of 
its limitations in the present setting.) 

In order to state the main result of the present article, we introduce the 
following norm on the space of smooth observables 
$\phi : \mathcal{H} \to \mathbb{R}$: 

$\|\phi\|_\eta = \sup_{v \in \mathcal{H}} e^{-\eta \|v\|^2} \big(|\phi(v)| + \|D\phi(v)\|\big).$ 

Here, we denoted by $D\phi$ the Frechet derivative of $\phi$. With this 
notation, we will show that the operator $\mathcal{P}_t$ has a spectral gap 
in the norm $\|\cdot\|_\eta$ in the following sense: 

Theorem 1.4. Consider (SNS) in the range of parameters allowed by 
Assumption 1. For every $\eta$ small enough there exist constants $C$ and 
$\gamma$ such that 

$\|\mathcal{P}_t \phi - \mu_* \phi\|_\eta \le C e^{-\gamma t} \|\phi\|_\eta,$ 

for every Frechet differentiable function $\phi : \mathcal{H} \to \mathbb{R}$ 
and every $t \ge 0$. 

In [19] a similar operator-norm estimate on $\mathcal{P}_t$ was obtained in a 
weighted total variation norm ($\|\cdot\|_\eta$ without the $\|D\phi\|$ term) 
when the forcing was 
spatially rough and nondegenerate. Our setting is quite different. The spa- 
tially rough and nondegenerate forcing makes the analysis much closer to 
the finite-dimensional setting. It is not expected that such estimates hold 
in the total variation norm in the setting of this article. We should also 
remark that previous estimates, such as [20, 29, 37, 38], giving simply expo- 
nential convergence to equilibrium are weaker and the results in this article 
represent a significant and new extension of those results. 
It is sometimes of interest to know that the structure functions of the 
solution to (SNS) converge to the structure functions determined by $\mu_*$. 
This is not an immediate consequence of Theorem 1.4 because point 
evaluations of the velocity field are not continuous functions on 
$\mathcal{H}$. The smoothing properties of (SNS) nevertheless allow us to 
show the following result, which follows from Theorem 1.4 and 
Proposition 5.12 below. 

Theorem 1.5. Consider (SNS) in the range of parameters allowed by 
Assumption 1. Let $n \ge 1$ and define the n-point structure functions by 

$S_n(x_1, \ldots, x_n) = \int_{\mathcal{H}} \prod_{i=1}^{n} v(x_i)\,\mu_*(dv).$ 

Then, for every $\eta > 0$, there exist constants $C$ and $\gamma > 0$ such 
that, for every $v_0 \in \mathcal{H}$, one has the bound 

$\sup_{x_1, \ldots, x_n} \Big| \mathbf{E} \prod_{i=1}^{n} v(x_i, t) - S_n(x_1, \ldots, x_n) \Big| \le C e^{\eta \|v_0\|^2 - \gamma t},$ 

for every $t \ge 1$. Here, $v(x,t)$ is the solution of (SNS) with initial 
condition $v_0$. 

The remainder of this article is structured as follows. In Section 2, we 
begin with an abstract discussion of our ideas in a setting with uniform 
estimates. In Section 3, we give the main theoretical results of the paper 
which combine the ideas from Section 2 with estimates stemming from an 
assumed Lyapunov structure. The convergence is measured in a distance in 
which paths are weighted by the Lyapunov function. We then turn in 
Section 5 to the specifics of the stochastic Navier-Stokes equation and show 
that it satisfies the general theorems from Section 3. In Section 5.3, we 
show that the Markov semigroup generated by (SNS) is strongly continuous on 
a suitable Banach space and that its generator has a spectral gap there. 
Then in Section 5.5, we demonstrate the power of the spectral gap estimates 
by giving a short proof that (SNS)'s invariant measures depend continuously 
on all the parameters of the equation. 

2. A simplified, uniform setting. In this section, we illustrate many of 
the main ideas used throughout this article in a simplified setting. We 
consider the analogue of one of the simplest (and yet powerful) conditions 
for a Markov chain with transition probabilities $\mathcal{P}$ to have a 
unique invariant measure, namely Doeblin's condition 3 : 

3 Doeblin's original condition was the existence of a probability measure 
$\nu$ and a constant $\varepsilon > 0$ such that 
$\mathcal{P}(x, A) > \varepsilon$ whenever $\nu(A) > 1 - \varepsilon$; see 
[11, 12]. It turns out that, provided that the Markov chain is aperiodic and 
$\psi$-irreducible, this is equivalent to the (in general stronger) 
condition given here; see [40], Theorem 16.0.2. 

Theorem 2.1 (Doeblin). Assume that there exist $\delta > 0$ and a 
probability measure $\nu$ such that 
$\mathcal{P}(x, \cdot) \ge \delta \nu$ for every $x$. Then, there exists a 
unique probability measure $\mu_*$ such that 
$\mathcal{P}^* \mu_* = \mu_*$. Furthermore, one has 
$\|\mathcal{P}\phi - \mu_*\phi\|_\infty \le (1 - \delta) \|\phi - \mu_*\phi\|_\infty$ 
for every bounded measurable function $\phi$. 
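
The minorization mechanism can be checked by hand on a finite state space. 
The sketch below is an illustration only: the $3 \times 3$ transition matrix 
is hypothetical, and the quantity tested is the dual form of the contraction, 
namely that one step of the chain shrinks the total variation distance 
(normalized to take values in $[0,1]$) between two initial laws by a factor 
of at most $1 - \delta$. 

```python
def doeblin_constant(P):
    """Largest delta with P[x][y] >= delta * nu[y] for all x (column minima)."""
    n = len(P)
    col_min = [min(P[x][y] for x in range(n)) for y in range(n)]
    delta = sum(col_min)
    nu = [m / delta for m in col_min]
    return delta, nu


def push_forward(mu, P):
    """Law after one step of the chain started from the law mu."""
    n = len(P)
    return [sum(mu[x] * P[x][y] for x in range(n)) for y in range(n)]


def tv(mu1, mu2):
    """Total variation distance, normalized to lie in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu1, mu2))


# Hypothetical transition matrix; each row sums to one.
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

delta, nu = doeblin_constant(P)
mu1 = [1.0, 0.0, 0.0]  # chain started in state 0
mu2 = [0.0, 0.0, 1.0]  # chain started in state 2
before = tv(mu1, mu2)
after = tv(push_forward(mu1, P), push_forward(mu2, P))
print(delta, before, after)
```

Here the column minima sum to $\delta = 0.7$, and the distance between the 
two laws drops from 1 to about 0.2, which is below the guaranteed 
$1 - \delta = 0.3$. 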

A typical example of a semigroup for which Theorem 2.1 can be applied is 
given by a nondegenerate diffusion on a smooth compact manifold. Theorem 
2.1 shows the fundamental mechanism for convergence to equilibrium in total 
variation norm. It is simple because the assumed estimates are extremely 
uniform. In this section we give a theorem guaranteeing convergence in a 
Wasserstein distance with assumptions analogous to Doeblin's condition. 

A classical generalization of Doeblin's condition was made by Harris [22] 
who showed how to combine the existence of a Lyapunov function and a 
Doeblin-like estimate localized to a sufficiently large compact set to prove 
convergence to equilibrium. We will give a "Harris-like" version of our results 
in Section 3. 

2.1. Spectral gap under uniform estimates. The aim of this section is to 
present a very simple condition that ensures that a Markov semigroup 
$\mathcal{P}_t$ on a Banach space $\mathcal{H}$ yields a contraction operator 
on the space of probability 
measures endowed with a Wasserstein distance. One can view it as a ver- 
sion of Doeblin's condition for the Wasserstein distance instead of the total 
variation distance. The main motivation for using a distance that metrizes 
the topology of weak convergence is that probability measures on infinite- 
dimensional spaces tend to be mutually singular, so that strong convergence 
is not expected to hold in general, even for ergodic systems. 

The first assumption captures the regularizing effect of the Markov semi- 
group. While it does not imply that one function space is mapped into a 
more regular one as often occurs, it does say that at least gradients are 
decreased. 

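Assumption 2. There exist constants $\alpha_1 \in (0,1)$, $C > 0$ and 
$T_1 > 0$ such that 

(2)  $\|D\mathcal{P}_t \phi\|_\infty \le C \|\phi\|_\infty + \alpha_1 \|D\phi\|_\infty,$ 

for every $t \ge T_1$ and every Frechet differentiable function 
$\phi : \mathcal{H} \to \mathbb{R}$. 

Remark 2.2. A typical way of checking (2) is to first show that for every 
$t \ge 0$, $\mathcal{P}_t$ is bounded as an operator on the space with norm 
$\|\phi\|_\infty + \|D\phi\|_\infty$. It then suffices to check that (2) 
holds with $\alpha_1 < 1$ for one particular time $t$ to deduce from the 
semigroup property that a bound of the same form holds, with possibly 
different constants $C$ and $\alpha_1 < 1$, for every sufficiently large 
time. 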
Remark 2.3. If the space $\mathcal{H}$ is actually a compact manifold, (2) 
together with the Arzela-Ascoli theorem implies that the essential spectral 
radius of $\mathcal{P}_t$ (as an operator on the space with norm 
$\|\phi\|_\infty + \|D\phi\|_\infty$) is strictly smaller than 1. This is a 
well-known and often exploited fact 4 in the theory of dynamical systems. A 
bound like (2) is usually referred to as the Lasota-Yorke inequality 
[30, 31] or the Ionescu-Tulcea-Marinescu inequality [26] and is used to show 
the existence of absolutely continuous invariant measures when 
$\mathcal{P}_t$ is the transfer operator acting on densities. Usually, it is 
used with two different Holder norms on the right-hand side. The present 
application with a Lipschitz norm and an infinity norm has a different 
flavor. 

This bound alone is of course not enough in general to guarantee the 
uniqueness of the invariant measure. (Counterexamples with 
$\mathcal{H} = S^1$, the unit circle, can easily be constructed.) 
Furthermore, we are mainly interested in the case where $\mathcal{H}$ is not 
even locally compact. 

In order to formulate our second assumption, we use the notation 
$\mathcal{C}(\mu_1, \mu_2)$ for the set of all measures $\Gamma$ on 
$\mathcal{H} \times \mathcal{H}$ such that 
$\Gamma(A \times \mathcal{H}) = \mu_1(A)$ and 
$\Gamma(\mathcal{H} \times A) = \mu_2(A)$ for every Borel set 
$A \subset \mathcal{H}$. Such a measure $\Gamma$ on the product space is 
referred to as a coupling of $\mu_1$ and $\mu_2$. We also denote by 
$\mathcal{P}_t^*$ the semigroup acting on probability measures which is dual 
to $\mathcal{P}_t$. With this notation, our second assumption, which is a 
form of uniform topological irreducibility, reads: 

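Assumption 3. For every $\delta > 0$, there exists a $T_2 = T_2(\delta)$ so 
that for any $t \ge T_2$ there exists an $a > 0$ so that 

$\sup_{\Gamma \in \mathcal{C}(\mathcal{P}_t^* \delta_x, \mathcal{P}_t^* \delta_y)} \Gamma\{(x', y') \in \mathcal{H} \times \mathcal{H} : \|x' - y'\| \le \delta\} \ge a,$ 

for every $x, y \in \mathcal{H}$. 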

Remark 2.4. Recall that the total variation distance between probability 
measures can be characterized as one minus the supremum over all couplings 
of the mass of the diagonal. Therefore, if we set $\delta = 0$ in 
Assumption 3, we get 
$\|\mathcal{P}_t(x, \cdot) - \mathcal{P}_t(y, \cdot)\|_{TV} \le 1 - a$ for 
every $x$ and $y$. By [40], Theorem 16.0.2, this is equivalent to the 
assumption of Theorem 2.1, so that the results in this section can be viewed 
as an analogue of Doeblin's theorem. 
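
For measures on a finite set, the coupling characterization of total 
variation recalled in Remark 2.4 reduces to a one-line identity: the largest 
mass a coupling can put on the diagonal is $\sum_i \min(p_i, q_i)$. The 
following sketch, with two hypothetical probability vectors, checks that one 
minus this mass agrees with the usual formula $\frac12 \sum_i |p_i - q_i|$. 

```python
def max_diagonal_mass(p, q):
    """Largest mass any coupling of p and q can place on the diagonal."""
    return sum(min(a, b) for a, b in zip(p, q))


def total_variation(p, q):
    """Total variation distance, normalized to lie in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical probability vectors on a three-point space.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.6, 0.2]

diag = max_diagonal_mass(p, q)
dist = total_variation(p, q)
print(diag, dist, 1.0 - diag)
```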



4 It can be obtained as a corollary of the fact that the essential spectral 
radius of an operator $T$ can be characterized as the supremum over all 
$\lambda > 0$ such that there exists a singular sequence 
$\{f_n\}_{n \ge 0}$ with $\|f_n\| = 1$ and $\|T f_n\| \ge \lambda$ for every 
$n$. A slightly different proof can be found in [23] and is directly based 
on the study of the essential spectral radius by Nussbaum [42]. It is, 
however, very close in spirit to the much earlier paper [26]. 
To measure the convergence to equilibrium, we will use the following 
distance function on $\mathcal{H}$: 

(3)  $d(x, y) = \min\{1, \delta^{-1} \|x - y\|\},$ 

where $\delta$ is a small parameter to be adjusted later on. The distance 
(3) extends in a natural way to a Wasserstein distance between probability 
measures by 

(4)  $d(\mu_1, \mu_2) = \sup_{\mathrm{Lip}_d(\phi) \le 1} \Big| \int \phi(x)\,\mu_1(dx) - \int \phi(x)\,\mu_2(dx) \Big|,$ 

where $\mathrm{Lip}_d(\phi)$ denotes the Lipschitz constant of $\phi$ in the 
metric $d$. By the Monge-Kantorovich duality [44, 49], the right-hand side 
of (4) is equivalent to 

(5)  $d(\mu_1, \mu_2) = \inf_{\mu \in \mathcal{C}(\mu_1, \mu_2)} \iint d(x, y)\,\mu(dx, dy).$ 
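
As a small sanity check of how (4) and (5) fit together, consider measures 
on the real line with $\mu_1$ a point mass. The sketch below is an 
illustration with hypothetical values ($\delta = 0.5$ and the two measures 
are chosen for convenience). Since $\mu_1$ is a Dirac mass, the product 
measure is the only coupling, so its cost is the infimum in (5); the test 
function $\phi(z) = d(0, z)$ is 1-Lipschitz for $d$ and gives a lower bound 
for the supremum in (4). The two values coincide. 

```python
def d(x, y, delta):
    """Truncated distance (3) on the real line."""
    return min(1.0, abs(x - y) / delta)


delta = 0.5                     # hypothetical choice of the parameter
mu1 = {0.0: 1.0}                # Dirac mass at 0
mu2 = {0.25: 0.5, 5.0: 0.5}     # half the mass nearby, half far away

# Cost of the (unique) coupling of mu1 and mu2: the product measure.
primal = sum(w1 * w2 * d(x, y, delta)
             for x, w1 in mu1.items()
             for y, w2 in mu2.items())


def phi(z):
    """A test function with Lipschitz constant 1 in the metric d."""
    return d(0.0, z, delta)


# Value of the dual functional in (4) for this particular phi.
dual = abs(sum(w * phi(x) for x, w in mu1.items())
           - sum(w * phi(y) for y, w in mu2.items()))
print(primal, dual)
```

The mass sitting at distance 5 contributes the capped cost 1 rather than 10, 
which is the effect of the truncation in (3). 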

(Note that the infimum is actually achieved; see [50], Theorem 4.1.) With 
this notation, one has the following convergence result: 

Theorem 2.5. Let $(\mathcal{P}_t)_{t \ge 0}$ be a Markov semigroup over a 
Banach space $\mathcal{H}$ satisfying Assumptions 2 and 3. Then, there exist 
constants $\delta > 0$, $\alpha < 1$ and $T > 0$ such that 

(6)  $d(\mathcal{P}_T^* \mu_1, \mathcal{P}_T^* \mu_2) \le \alpha\, d(\mu_1, \mu_2),$ 

for every pair of probability measures $\mu_1$, $\mu_2$ on $\mathcal{H}$. In 
particular, $(\mathcal{P}_t)_{t \ge 0}$ has a unique invariant measure 
$\mu_*$ and its transition probabilities converge exponentially fast to 
$\mu_*$. 

Proof. We will prove the general result by first proving it for delta 
measures, namely, 

(7)  $d(\mathcal{P}_t^* \delta_x, \mathcal{P}_t^* \delta_y) \le \alpha\, d(x, y)$ 

for all $(x, y) \in \mathcal{H} \times \mathcal{H}$. Once this estimate is 
proven, we can finish the proof of the general case by the following 
argument. 

The Monge-Kantorovich duality yields $Q \in \mathcal{C}(\mu_1, \mu_2)$ so 
that $d(\mu_1, \mu_2) = \int d(x, y)\,Q(dx, dy)$. To complete the proof 
observe that 

$d(\mathcal{P}_t^* \mu_1, \mathcal{P}_t^* \mu_2) \le \int d(\mathcal{P}_t^* \delta_x, \mathcal{P}_t^* \delta_y)\,Q(dx, dy) \le \alpha \int d(x, y)\,Q(dx, dy) = \alpha\, d(\mu_1, \mu_2).$ 






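Let us first show that (7) holds when $\|x - y\| < \delta$ for some 
appropriately chosen $\delta$. Note that by (4) this is equivalent to 
showing that 

(8)  $|\mathcal{P}_t \phi(x) - \mathcal{P}_t \phi(y)| \le \alpha\, d(x, y) = \alpha \delta^{-1} \|x - y\|$ 

for all smooth $\phi$ with $\mathrm{Lip}_d(\phi) \le 1$. Note that we can 
assume $\phi(0) = 0$, so that this implies that 
$\|D\phi\|_\infty \le \delta^{-1}$ and $\|\phi\|_\infty \le 1$. It follows 
from Assumption 2 that 
$\|D\mathcal{P}_t \phi\|_\infty \le C + \alpha_1 \delta^{-1}$ for every 
$t \ge T_1$. Choosing $\delta = (1 - \alpha_1)/(2C)$ and substituting in for 
$C$, we get 
$\|D\mathcal{P}_t \phi\|_\infty \le \delta^{-1} (1 + \alpha_1)/2$, so that 
(8) holds for $t \ge T_1$ and $\alpha \ge (1 + \alpha_1)/2$. 

Let us now turn to the case $\|x - y\| \ge \delta$. It follows from 
Assumption 3 that for every $t \ge T_2(\delta)$ there exists a positive 
constant $a$ so that for any $(x, y) \in \mathcal{H}^2$ there exists 
$\Gamma \in \mathcal{C}(\mathcal{P}_t^* \delta_x, \mathcal{P}_t^* \delta_y)$ 
such that $\Gamma(\Delta_\delta) \ge a > 0$, where 

$\Delta_\delta = \{(x', y') : \|x' - y'\| \le \tfrac{1}{2}\delta\}.$ 

Since $d(x', y') \le \tfrac12$ on $\Delta_\delta$ and $d(x', y') \le 1$ on 
the complement, one has 

$\int d(x', y')\,\Gamma(dx', dy') \le \tfrac12 \Gamma(\Delta_\delta) + 1 - \Gamma(\Delta_\delta) = 1 - \tfrac12 \Gamma(\Delta_\delta) \le 1 - \tfrac{a}{2}.$ 

Since we are in the setting $d(x, y) = 1$, this implies that when 
$\|x - y\| \ge \delta$, 

$|\mathcal{P}_t \phi(x) - \mathcal{P}_t \phi(y)| \le \alpha\, d(x, y)$ 

holds for $\alpha \ge 1 - \tfrac{a}{2}$ and $t \ge T_2(\delta)$. 

Setting $\alpha = \max\{1 - \tfrac{a}{2}, \tfrac12 (1 + \alpha_1)\}$ and 
$T = \max\{T_1, T_2(\delta)\}$ completes the proof. □ 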
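
The bookkeeping of constants in this proof can be traced numerically. The 
sketch below is illustrative only: the function name and the sample values 
of $\alpha_1$, $C$ and $a$ are hypothetical. It computes the choice 
$\delta = (1 - \alpha_1)/(2C)$, the resulting gradient bound 
$C + \alpha_1 \delta^{-1}$, and the final contraction factor $\alpha$. 

```python
def contraction_parameters(alpha1, C, a):
    """Constants from the proof of Theorem 2.5.

    alpha1, C : constants of the gradient bound (2), with 0 < alpha1 < 1
    a         : lower bound on the coupling mass from Assumption 3
    """
    delta = (1.0 - alpha1) / (2.0 * C)
    # Bound on the gradient of P_t phi when Lip_d(phi) <= 1.
    grad_bound = C + alpha1 / delta
    # Contraction factor: worst of the "near" and "far" cases.
    alpha = max(1.0 - a / 2.0, (1.0 + alpha1) / 2.0)
    return delta, grad_bound, alpha


# Hypothetical sample values.
delta, grad_bound, alpha = contraction_parameters(alpha1=0.5, C=2.0, a=0.2)
print(delta, grad_bound, alpha)
```

With these sample values the gradient bound equals 
$\delta^{-1}(1 + \alpha_1)/2 = 6$, and the far case, governed by 
$1 - a/2 = 0.9$, determines the contraction factor. 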

Corollary 2.6. Let $\mathcal{P}_t$ be as in Theorem 2.5. Then, there exist 
constants $\alpha < 1$ and $T > 0$ such that 

(9)  $\|\mathcal{P}_T \phi - \mu_* \phi\|_{1,\infty} \le \alpha \|\phi\|_{1,\infty}, \qquad \|\phi\|_{1,\infty} = \sup_{x \in \mathcal{H}} \big(|\phi(x)| + \|D\phi(x)\|\big),$ 

for every Frechet differentiable function 
$\phi : \mathcal{H} \to \mathbb{R}$. 

Proof. Define $d_1(x, y) = 1 \wedge \|x - y\|$. Since $d$ is equivalent to 
$d_1$, (6) still holds for arbitrary $\alpha$ (but with a different value 
for $T$) with $d$ replaced by $d_1$. The claim then follows from the 
Monge-Kantorovich duality, noting that 
$\mathrm{Lip}_{d_1}(\phi) \le 2 \|\phi\|_{1,\infty}$ and, for functions 
$\phi$ with $\int \phi(x)\,\mu_*(dx) = 0$, 
$\|\phi\|_{1,\infty} \le \mathrm{Lip}_{d_1}(\phi)$. □ 
2.2. A more pathwise perspective. In [20, 37, 38], the authors advocated 
a pathwise point of view which explicitly constructed coupled versions of 
the process starting from two different initial conditions in such a way that 
the two coupled processes converged together exponentially fast. This point 
of view is very appealing as it conveys a lot of intuition; however, writing 
down the details can become a bit byzantine. Hence the authors prefer the 
arguments given in the preceding section for their succinctness and ease of 
verification. Nonetheless, the calculations of the present section provided the 
intuition which guided the previous section; and hence, we find it instructive 
to present them. As none of the rest of the paper uses any of the calculations 
from this section, we do not provide all of the details. Our goal is to show how 
the estimates from the previous section can be used to construct an explicit 
coupling in which the expectation of the distance between the trajectories 
starting from $x_0$ and $y_0$ converges to zero exponentially in time. 

Fix a $t > \max(T_1, T_2)$, where $T_1$ and $T_2$ are the constants in 
Assumptions 2 and 3. Fix $\delta > 0$ as we did in the proof of Theorem 2.5. 
Now for $k = 0, 1, \ldots$ define the following sequence of stopping times: 

$r_k = \inf\{m \ge s_{k-1} : m \in \mathbb{N},\ \|x_{mt} - y_{mt}\| < (1 - \alpha_1)\delta\},$ 

$s_k = \inf\{m \ge r_k : m \in \mathbb{N},\ \|x_{mt} - y_{mt}\| > \delta\},$ 

where $s_{-1} = 0$ by definition. 

If $n \in [r_k, s_k)$, let the distribution of 
$(x_{t(n+1)}, y_{t(n+1)})$ be given by a coupling which minimizes 
$\mathbf{E}\, d(x_{t(n+1)}, y_{t(n+1)})$. Hence choosing 
$\delta = (1 - \alpha_1)/(2C)$ and $\alpha \in ((1 + \alpha_1)/2, 1)$ as in 
the paragraph below (8), Monge-Kantorovich duality gives that 

(10)  $\mathbf{E}\big(d(x_{t(n+1)}, y_{t(n+1)}) \mid (x_{tn}, y_{tn})\big) \le \alpha \delta^{-1} \|x_{tn} - y_{tn}\|$ 

provided $n \in [r_k, s_k)$. Given a random variable $X$ and events $A$ and 
$B$, for notational convenience, we define 
$\mathbf{E}(X; A) = \mathbf{E}(X \mathbf{1}_A)$ and 
$\mathbf{P}(A; B) = \mathbf{E}(\mathbf{1}_A \mathbf{1}_B)$, where 
$\mathbf{1}_A$ is the indicator function of the event $A$. Observe that if 
$\|x_0 - y_0\| < (1 - \alpha_1)\delta$, then 

(11)  $\mathbf{E}\big(\|x_{t(n+1)} - y_{t(n+1)}\|; s_1 > n+1 \mid (x_{tn}, y_{tn})\big) \le \alpha \|x_{tn} - y_{tn}\| \mathbf{1}_{s_1 > n},$ 

which implies that 

$\mathbf{E}\big(\|x_{t(n+1)} - y_{t(n+1)}\|; s_1 > n+1\big) \le \alpha^{n+1} \|x_0 - y_0\|.$ 

From this we see that as long as $x$ and $y$ stay in a $\delta$ ball of each 
other, they will converge toward each other exponentially in expectation. 
Observe furthermore that 

$\delta \mathbf{P}(s_1 = n) = \delta \mathbf{P}(\|x_{tn} - y_{tn}\| > \delta; s_1 > n-1)$ 

$\le \delta \mathbf{E}(d(x_{tn}, y_{tn}); s_1 > n-1)$ 

$= \delta \mathbf{E}\big(\mathbf{E}(d(x_{tn}, y_{tn}) \mid (x_{t(n-1)}, y_{t(n-1)})); s_1 > n-1\big)$ 

$\le \alpha \mathbf{E}(\|x_{t(n-1)} - y_{t(n-1)}\|; s_1 > n-1) \le \alpha^n \|x_0 - y_0\|,$ 
where we used (10) to go from the third to the last line. Hence, assuming 
that $\|x_0 - y_0\| < (1 - \alpha)\delta$, 

(12)  $\mathbf{P}(s_1 \ge n) = 1 - \sum_{k=1}^{n-1} \mathbf{P}(s_1 = k) \ge 1 - \alpha,$ 

so that $\mathbf{P}(s_1 < \infty) \le \alpha < 1$. This shows that there is 
a positive chance that the two paths will indeed stay at distance less than 
$\delta$ from each other for all time. 
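
The summation behind (12) can be spelled out as follows. This is a sketch 
that uses only the bound 
$\delta \mathbf{P}(s_1 = k) \le \alpha^k \|x_0 - y_0\|$ obtained above 
together with the assumption $\|x_0 - y_0\| < (1 - \alpha)\delta$: 

```latex
\sum_{k \ge 1} \mathbf{P}(s_1 = k)
  \le \delta^{-1} \|x_0 - y_0\| \sum_{k \ge 1} \alpha^k
  = \frac{\alpha}{1 - \alpha}\, \delta^{-1} \|x_0 - y_0\|
  \le \alpha .
```

The geometric series is what forces the starting distance to be smaller 
than $(1 - \alpha)\delta$ rather than merely smaller than $\delta$. 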

All of the above calculations were predicated on the fact that $x_0$ and 
$y_0$ were initially less than $(1 - \alpha)\delta$ apart. On the other 
hand, for $n \in [s_{k-1}, r_k)$, Assumption 3 guarantees that there exists 
a coupling for $(x_{t(n+1)}, y_{t(n+1)})$ so that 

$\mathbf{P}\big(\|x_{t(n+1)} - y_{t(n+1)}\| < (1 - \alpha)\delta \mid (x_{tn}, y_{tn})\big) \ge a,$ 

for some fixed $a > 0$. This shows that 
$\mathbf{P}(r_1 > n) \le (1 - a)^n$, so that the two paths will enter a 
$(1 - \alpha)\delta$ ball of each other at a random time which has an 
exponentially decaying tail. We now sketch how to put these observations 
together more formally. 

Let $d_1(x, y) = 1 \wedge \|x - y\|$ and define 
$\tau = \inf\{k : s_{k+1} = \infty\}$. We now combine the above estimates to 
sketch the proof of the exponential convergence to 0 of 
$\mathbf{E}\, d_1(x_{nt}, y_{nt})$. There are a few subtle issues arising 
from the fact that $\tau$ is not adapted to the natural filtration of the 
process, and we refer the interested reader to [20, 38, 43] for examples on 
how to circumvent these technicalities by a specific construction of 
$(x_{nt}, y_{nt})$. Since our goal is only to sketch the argument, we do not 
concern ourselves with these issues here. 

Observe that for any $\beta \in (0, 1)$ 

$\mathbf{E}\, d_1(x_{nt}, y_{nt}) \le \mathbf{E}(d_1(x_{nt}, y_{nt}); r_\tau < n/2) + \mathbf{P}(\tau > \beta n) + \mathbf{P}(\tau \le \beta n; r_\tau \ge n/2).$ 

The first term decays exponentially fast in $n$ by (11), since the paths 
are guaranteed to be at distance less than $\delta$ on the time interval 
$[n/2, n]$. The second term is bounded by $\alpha^{\beta n}$ from the 
estimate $\mathbf{P}(s_1 < \infty) \le \alpha$. Recall that the parameter 
$\beta$ is still free. Using the estimates from the preceding paragraph, it 
can be shown that for $\beta$ small enough the probability 
$\mathbf{P}(\tau \le \beta n; r_\tau \ge n/2)$ has exponentially decaying 
tails since the random variable $r_{k+1} - r_k$ has exponentially decaying 
tails when restricted to the set where $s_k < \infty$. 

3. Spectral gap under a Lyapunov structure. There are situations (the 
stochastic Navier-Stokes equations being a prime example) where it is not 
possible to verify Assumptions 2 and 3 in such a uniform way. The present 
section is an attempt to circumvent this by assuming that the system 
possesses a type of Lyapunov structure that compensates for the lack of 
uniformity of these estimates. The relationship between the results of the 
previous section and those of this section is analogous to the relationship 
between Doeblin's condition mentioned in the last section and Harris' 
conditions [16, 22, 40]. While the assumptions given in this section are 
heavily influenced by the known a priori bounds on the dynamics of the 
two-dimensional Navier-Stokes equations, we suspect the result will be 
useful more widely. 

Throughout this section and the remainder of this article, we assume that 
we are given a random flow $\Phi_t$ on a Banach space $\mathcal{H}$. We will 
assume that the map $x \mapsto \Phi_t(\omega, x)$ is $C^1$ for almost every 
element $\omega$ of the underlying probability space. We will denote by 
$D\Phi_t$ the Frechet derivative of $\Phi_t(\omega, x)$ with respect to $x$. 

Our first assumption is a strong type of Lyapunov structure on the flow: 

Assumption 4. There exists a continuous function 
$V : \mathcal{H} \to [1, \infty)$ with the following properties: 

1. There exist two strictly increasing continuous functions $V_*$ and $V^*$ 
from $[0, \infty) \to [1, \infty)$ so that 

(13)  $V_*(\|x\|) \le V(x) \le V^*(\|x\|)$ 

for all $x \in \mathcal{H}$ and such that 
$\lim_{a \to \infty} V_*(a) = \infty$. 

2. There exist constants $C$ and $\kappa \ge 1$ such that 

(14)  $a V^*(a) \le C V_*^{\kappa}(a),$ 

for every $a > 0$. 

3. There exist a positive constant $C$, $r_0 < 1$, a decreasing function 
$\xi : [0, 1] \to [0, 1]$ with $\xi(1) < 1$ such that for every 
$h \in \mathcal{H}$ with $\|h\| = 1$ 

(15)  $\mathbf{E} V^r(\Phi_t(x)) \big(1 + \|D\Phi_t(x) h\|\big) \le C V^{r \xi(t)}(x),$ 

for every $x \in \mathcal{H}$, every $r \in [r_0, 2\kappa]$ and every 
$t \in [0, 1]$. 

Remark 3.1. It follows from (15) and Jensen's inequality that there 
exists a constant $C$ such that 

(16)  $\mathbf{E} V^r(\Phi_t(x)) \le C V^{r \xi(1)^{[t]} \xi(t - [t])}(x) = C V^{r \xi(t)}(x),$ 

for every $t \ge 0$ and every $r \in [0, 2\kappa]$, where $[t]$ is the 
greatest integer smaller than $t$. In the last equality, we have extended 
the definition of $\xi$ to values of $t$ in $[0, \infty)$. 
For $r \in (0, 1]$, we introduce a family of distances $\rho_r$ on 
$\mathcal{H}$ by 

$\rho_r(x, y) = \inf_{\gamma} \int_0^1 V^r(\gamma(t)) \|\dot\gamma(t)\|\,dt,$ 

where the infimum runs over all paths $\gamma$ such that $\gamma(0) = x$ and 
$\gamma(1) = y$. 
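
To get a feeling for this distance, it helps to look at a one-dimensional 
toy case. The sketch below is an illustration under stated assumptions, not 
part of the paper's argument: it takes $\mathcal{H} = \mathbb{R}$ and the 
hypothetical weight $V(x) = 1 + x^2$. On the real line every path from $x$ 
to $y$ has to cross the interval between them and $V^r$ is positive, so the 
infimum is attained by the monotone path and $\rho_r(x, y)$ is the integral 
of $V^r$ between $x$ and $y$, approximated here by the midpoint rule. 

```python
def rho_r_1d(x, y, r, V, n=10000):
    """Midpoint-rule approximation of the integral of V(s)**r between x and y.

    In one dimension this integral equals rho_r(x, y), because the
    straight (monotone) path is optimal.
    """
    lo, hi = min(x, y), max(x, y)
    h = (hi - lo) / n
    return h * sum(V(lo + (i + 0.5) * h) ** r for i in range(n))


def V(x):
    """Hypothetical Lyapunov-type weight with values in [1, infinity)."""
    return 1.0 + x * x


near_origin = rho_r_1d(0.0, 1.0, 1.0, V)    # exact value 4/3
far_out = rho_r_1d(10.0, 11.0, 1.0, V)      # same norm distance, larger weight
half_power = rho_r_1d(0.0, 1.0, 0.5, V)     # smaller r gives a smaller distance
print(near_origin, far_out, half_power)
```

Two points at unit distance are much further apart in $\rho_r$ when they sit 
far from the origin, which is how the Lyapunov function enters the 
two-point dynamics. 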

In the interest of brevity, we will write $\rho$ for $\rho_1$. The main 
consequence of Assumption 4 used in this paper is that, via the distance 
function $\rho_r$, it also induces a kind of Lyapunov structure on the 
two-point dynamics: 

Lemma 3.2. Assume that $\Phi_t$ is as above and that Assumption 4 holds. 
Then, for every $r \in [r_0, 1]$, there exist constants 
$\alpha \in (0, 1)$ and $C, K > 0$ such that 

(17)  $\mathbf{E} \rho_r(\Phi_t(x), \Phi_t(y)) \le C \rho_r(x, y), \qquad \mathbf{E} \rho_r(\Phi_n(x), \Phi_n(y)) \le \alpha^n \rho_r(x, y) + K,$ 

for every $n \in \mathbb{N}$, every $t \in [0, 1]$ and every pair 
$(x, y) \in \mathcal{H}^2$. 

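Proof. It suffices to show the second inequality in (17) for $n = 1$, since 
the other cases follow by iteration. Fix any $\varepsilon > 0$ and fix a 
curve $\gamma$ connecting $x$ to $y$ such that 

(18)  $\rho_r(x, y) \le \int_0^1 V^r(\gamma(t)) \|\dot\gamma(t)\|\,dt \le \rho_r(x, y) + \varepsilon$ 

and denote $\tilde\gamma(s) = \Phi_t(\gamma(s))$ for some $t \in [0, 1]$. We 
then have 

$\mathbf{E} \rho_r(\Phi_t(x), \Phi_t(y)) \le \mathbf{E} \int_0^1 V^r(\tilde\gamma(s)) \|\dot{\tilde\gamma}(s)\|\,ds$ 

$\le \mathbf{E} \int_0^1 V^r(\tilde\gamma(s)) \|D\Phi_t(\gamma(s)) \dot\gamma(s)\|\,ds$ 

$\le C \int_0^1 V^{r \xi(t)}(\gamma(s)) \|\dot\gamma(s)\|\,ds \le C \rho_r(x, y) + C \varepsilon,$ 

where the last inequality uses the fact that $\xi(t) \le 1$ by assumption. 
Since $\varepsilon$ was arbitrary and $C$ independent of $\varepsilon$ this 
yields the first bound in (17). Let now $R$ be sufficiently large so that 
$C V^{r \xi(1)}(x) \le \alpha V^r(x)$ for every $x$ with $\|x\| \ge R$. Such 
an $R$ exists since $V_*$ tends to infinity. This yields 

(19)  $\mathbf{E} \rho_r(\Phi_1(x), \Phi_1(y)) \le \alpha \rho_r(x, y) + C V^*(R) \int_0^1 \mathbf{1}_{B(R)}(\gamma(s)) \|\dot\gamma(s)\|\,ds + \varepsilon,$ 

where we denoted by $B(R)$ the ball of radius $R$ in $\mathcal{H}$ centered 
at the origin. Note now that 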

[ lB(H)(700)ll7toll ds < 2r(^^-J + 6, 
since one could otherwise replace the corresponding piece of curve by a 
straight line and obtain a value which differed from $\rho_r(x, y)$ by more 
than $\varepsilon$. Plugging this into (19) and again recalling that 
$\varepsilon$ was arbitrary concludes the proof. □ 
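
The iteration step mentioned at the start of the proof can be sketched as 
follows. This is an illustration under an assumption not spelled out in the 
text, namely that the flow over successive unit time intervals has the 
Markov property, so that the one-step bound 
$\mathbf{E} \rho_r(\Phi_1(x), \Phi_1(y)) \le \alpha \rho_r(x, y) + K_1$ may 
be applied conditionally on the state at time $n - 1$. The constant $K_1$ is 
a name introduced here for the additive term produced by (19): 

```latex
\mathbf{E} \rho_r(\Phi_n(x), \Phi_n(y))
  \le \alpha\, \mathbf{E} \rho_r(\Phi_{n-1}(x), \Phi_{n-1}(y)) + K_1
  \le \cdots
  \le \alpha^n \rho_r(x, y) + K_1 \sum_{j=0}^{n-1} \alpha^j
  \le \alpha^n \rho_r(x, y) + \frac{K_1}{1 - \alpha} .
```

Under this assumption the second bound in (17) holds with 
$K = K_1 / (1 - \alpha)$, uniformly in $n$. 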

Our next assumption is a type of gradient inequality for the Markov 
semigroup $\mathcal{P}_t$ on $\mathcal{H}$ generated by the flow $\Phi_t$. 
In practice, this inequality can be verified if the system is hypoelliptic, 
in the sense of Hormander (or effectively elliptic) and has suitable 
dissipative properties, but this is a hard task in general. (See [21] for a 
discussion of hypoellipticity and effective ellipticity in the setting of 
the 2D Navier-Stokes equations.) 

Assumption 5. There exist a $C_1 > 0$ and $p \in [0, 1)$ so that for every 
$\alpha \in (0, 1)$ there exist positive $T(\alpha)$ and $C(\alpha)$ with 

(20)  $\|D\mathcal{P}_t \phi(x)\| \le C_1 V^p(x) \Big( C(\alpha) \sqrt{(\mathcal{P}_t |\phi|^2)(x)} + \alpha \sqrt{(\mathcal{P}_t \|D\phi\|^2)(x)} \Big)$ 

for every $x \in \mathcal{H}$ and $t \ge T(\alpha)$. 

Remark 3.3. While (20) is reminiscent of gradient estimates of the type 
considered in [4], there does not seem to be an obvious link between the 
two approaches. The main reason is that (20) is really a statement about 
the long-time behavior of $\mathcal{P}_t$ whereas the bounds in [4] are 
statements about the short-time behavior of $\mathcal{P}_t$. 

Our final assumption is a relatively weak form of irreducibility: 

Assumption 6. Given any $C > 0$, $r \in (0, 1)$ and $\delta > 0$, there 
exists a $T_0$ so that for any $T \ge T_0$ there exists an $a > 0$ so that 

$\inf_{\|x\|, \|y\| \le C}\ \sup_{\Gamma \in \mathcal{C}(\mathcal{P}_T^* \delta_x, \mathcal{P}_T^* \delta_y)} \Gamma\{(x', y') \in \mathcal{H} \times \mathcal{H} : \rho_r(x', y') < \delta\} \ge a.$ 

The main result of the present article is that under these conditions, one 
has uniform exponential convergence of the transition probabilities 
$\mathcal{P}_t(x, \cdot)$ to the (unique) invariant measure of the system: 

Theorem 3.4. Let $\Phi_t$ be a stochastic flow on a Banach space 
$\mathcal{H}$ which is almost surely $C^1$ and satisfies Assumption 4. 
Denote by $\mathcal{P}_t$ the corresponding Markov semigroup and assume that 
it satisfies Assumptions 5 and 6. Then, there exist positive constants $C$ 
and $\gamma$ such that 

(21)  $\rho(\mathcal{P}_t^* \mu_1, \mathcal{P}_t^* \mu_2) \le C e^{-\gamma t} \rho(\mu_1, \mu_2),$ 

for every $t \ge 0$ and any two probability measures $\mu_1$ and $\mu_2$ on 
$\mathcal{H}$. 
Since the space of probability measures $\mu$ on $\mathcal{H}$ such that 
$\rho(\mu, \delta_0) < \infty$ is complete for the topology induced by 
$\rho$ (see, e.g., [49]), (21) immediately yields: 

Corollary 3.5. Under the assumptions of Theorem 3.4, there exists a 
unique invariant probability measure $\mu_*$ for $\mathcal{P}_t$. 

Before we turn to the proof of Theorem 3.4, we give a statement that 
is equivalent, but involves the semigroup acting on observables instead of 
the semigroup acting on measures. Since in this setting the semigroup 
$\mathcal{P}_t$ possesses an invariant measure $\mu_*$, we can define the 
norm 

(22)  $\|\phi\|_\rho = \sup_{x \neq y} \frac{|\phi(x) - \phi(y)|}{\rho(x, y)} + \Big| \int_{\mathcal{H}} \phi(x)\,\mu_*(dx) \Big|.$ 

An alternative definition of this norm is given in Lemma 4.2 in the next 
section. 

Recall that we also make an abuse of notation by defining the projection 
operator $\mu_*$ by $\mu_* \phi = \int_{\mathcal{H}} \phi(y)\,\mu_*(dy)$. 
With this notation, we have the following statement, which is the dual 
statement of Theorem 3.4: 

Theorem 3.6. Let $\mathcal{P}_t$ be as in Theorem 3.4. Then, there exist 
constants $\gamma > 0$ and $C > 0$ such that 

$\|\mathcal{P}_t \phi - \mu_* \phi\|_\rho \le C e^{-\gamma t} \|\phi - \mu_* \phi\|_\rho,$ 

for every Frechet differentiable function 
$\phi : \mathcal{H} \to \mathbb{R}$ and every $t \ge 0$. 

Remark 3.7. This implies that the spectrum of $\mathcal{P}_t - \mu_*$ as an 
operator on the space of Frechet differentiable functions with finite 
$\|\cdot\|_\rho$-norm is contained in the disk of radius $e^{-\gamma t}$ 
around 0. Alternatively, $\mu_*$ is an eigenvector for $\mathcal{P}_t^*$ 
with eigenvalue 1. All other eigenvalues have real part whose magnitude is 
at most $e^{-\gamma t}$. This is the gap referred to in the title of the 
article. 

Proof of Theorem 3.6. Since 
$\|\mathcal{P}_t \phi - \mu_* \phi\|_\rho = \|\mathcal{P}_t(\phi - \mu_* \phi)\|_\rho$, 
we can assume without loss of generality that $\mu_* \phi = 0$. The claim 
then follows immediately from the fact that 

$|\mathcal{P}_t \phi(x) - \mathcal{P}_t \phi(y)| \le \|\phi\|_\rho\, \rho(\mathcal{P}_t^* \delta_x, \mathcal{P}_t^* \delta_y) \le C \|\phi\|_\rho e^{-\gamma t} \rho(x, y),$ 

where the last inequality follows from Theorem 3.4. Dividing both sides by 
$\rho(x, y)$ and taking the supremum over $x$ and $y$ concludes the 
proof. □ 
3.1. Proof of Theorem 3.4. The proof of Theorem 3.4 is technically very simple but relies on a trick, which consists in considering instead of $\rho$ a distance $d$ which is equivalent to $\rho$ but behaves like a large constant times $\rho_r$ for nearby points and like a small constant times $\rho$ for points that are far apart.

More precisely, given three constants $\delta > 0$, $r \in [r_0, 1)$ and $\beta \in (0, 1)$ to be determined later, we define

$$d(x,y) = 1 \wedge \delta^{-1}\rho_r(x,y) + \beta\rho(x,y).$$

Note that $d$ is indeed equivalent to $\rho$ since $\rho_r \le \rho$ and therefore

$$\beta\rho(x,y) \le d(x,y) \le (\delta^{-1} + \beta)\rho(x,y).$$

However, $d$ is much better than $\rho$ in capturing the geometry of the bounds available to us. This will allow us to proceed in a way similar to Section 2. This time, we will consider separately three cases. The first case, $\rho \ge K_*$, $\rho_r \ge \delta$, will be treated by using the Lyapunov structure given by Lemma 3.2. The second case, $\rho_r < \delta$, will be treated by using the gradient estimate of Assumption 5. Finally, the last case, $\rho < K_*$, $\rho_r \ge \delta$, will be treated using the irreducibility of Assumption 6. Lemma 3.9 is like the first part of the proof of Theorem 2.5 and Lemma 3.10 is like the second part. The first makes use of the local contraction guaranteed by Assumption 5. The second covers intermediate scales and uses Assumption 6 to ensure that the two points move close together some of the time to obtain a contraction estimate. Lemma 3.8 covers points far from the center of the space. Because of the weighting of the distance function by the Lyapunov function, there is contraction if the distant points simply move toward the center of the space.

The following three lemmas provide rigorous formulations of these claims.

Lemma 3.8. In the setting of Theorem 3.4, there exists a constant $K_*$ such that for every $\delta > 0$, every $\beta \in (0, 1)$ and every $r \in [r_0, 1)$ there exists a constant $\alpha_1 \in (0, 1)$ such that

$$\rho(x,y) \ge K_*,\ \rho_r(x,y) \ge \delta \quad\Longrightarrow\quad d(\mathcal{P}_n^*\delta_x, \mathcal{P}_n^*\delta_y) \le \alpha_1 d(x,y)$$

for all $n \in \mathbf{N}$.

Lemma 3.9. In the setting of Theorem 3.4, for any $\alpha_2 \in (0, 1)$ there exist $n_0 > 0$, $r \in [r_0, 1)$ and $\delta > 0$ so that

$$\rho_r(x,y) < \delta \quad\Longrightarrow\quad d(\mathcal{P}_n^*\delta_x, \mathcal{P}_n^*\delta_y) \le \alpha_2 d(x,y)$$

for all $n \ge n_0$ and $\beta \in (0, 1)$.

Lemma 3.10. In the setting of Theorem 3.4, for any $K_*$, $\delta > 0$, $r \in [r_0, 1)$ there exists an $n_1$ so that for any $n \ge n_1$ there is a $\beta \in (0, 1)$ and an $\alpha_3 \in (0, 1)$ so the following implication holds:

$$\rho(x,y) < K_*,\ \rho_r(x,y) \ge \delta \quad\Longrightarrow\quad d(\mathcal{P}_n^*\delta_x, \mathcal{P}_n^*\delta_y) \le \alpha_3 d(x,y).$$
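
The two-scale behavior of the distance $d$ can be made concrete with a short numerical sketch. It assumes the form $d = 1 \wedge \delta^{-1}\rho_r + \beta\rho$, which is consistent with the equivalence bound $\beta\rho \le d \le (\delta^{-1} + \beta)\rho$ quoted in this section. The function names and the sample values of $\rho_r$, $\rho$, $\delta$ and $\beta$ are illustrative assumptions.

```python
def d(rho_r, rho, delta, beta):
    """Modified distance: min(1, rho_r / delta) + beta * rho.

    rho_r and rho are the values of the two weighted distances at a fixed
    pair of points; the caller must ensure rho_r <= rho.
    """
    return min(1.0, rho_r / delta) + beta * rho


def check_equivalence(rho_r, rho, delta, beta, tol=1e-12):
    """Check beta*rho <= d <= (1/delta + beta)*rho for one pair of values."""
    value = d(rho_r, rho, delta, beta)
    return beta * rho <= value + tol and value <= (1.0 / delta + beta) * rho + tol


if __name__ == "__main__":
    delta, beta = 0.1, 0.05
    # (rho_r, rho) for a nearby pair, an intermediate pair and a distant pair.
    for rho_r, rho in [(0.01, 0.02), (0.5, 3.0), (2.0, 50.0)]:
        print(rho_r, rho, d(rho_r, rho, delta, beta),
              check_equivalence(rho_r, rho, delta, beta))
```

For nearby points the first term dominates and scales like $\delta^{-1}\rho_r$. Once $\rho_r \ge \delta$ the first term saturates at 1 and only the small multiple $\beta\rho$ keeps growing.
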

It now suffices to show that the conditions of all three statements can be satisfied simultaneously in order to complete the proof of Theorem 3.4:

Proof of Theorem 3.4. By the same argument as in the proof of Theorem 2.5, it suffices to prove that

$$d(\mathcal{P}_N^*\delta_x, \mathcal{P}_N^*\delta_y) \le \alpha\, d(x,y) \tag{23}$$

for some $N$, some $\alpha \in (0,1)$ and all $(x,y) \in \mathcal{H} \times \mathcal{H}$.

We begin by fixing $K_*$ as in Lemma 3.8. We then choose an arbitrary $\alpha_2 \in (0, 1)$ and apply Lemma 3.9, which fixes $n_0 > 0$, $r \in [r_0, 1)$ and $\delta > 0$. With these in hand, we turn to Lemma 3.10 and fix an $N$ with $N > \max\{n_0, n_1\}$. This in turn fixes $\beta \in (0, 1)$ and $\alpha_3 \in (0, 1)$. Fixing $\beta$ sets the value of $\alpha_1$ in Lemma 3.8. Setting $\alpha = \max\{\alpha_1, \alpha_2, \alpha_3\} < 1$ completes the proof. □

We now turn to the proof of Lemmas 3.8-3.10.

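Proof of Lemma 3.8. It follows from Lemma 3.2 that there exist constants $\alpha_* \in (0,1)$ and $K_* > 0$ such that

$$\mathbf{E}\rho(\Phi_n(x), \Phi_n(y)) \le \alpha_*\rho(x,y),$$

for every $(x,y)$ such that $\rho(x,y) \ge K_*$. Since $d(\mathcal{P}_n^*\delta_x, \mathcal{P}_n^*\delta_y) \le \mathbf{E}d(\Phi_n(x), \Phi_n(y))$ we thus get the bound

$$d(\mathcal{P}_n^*\delta_x, \mathcal{P}_n^*\delta_y) \le 1 + \alpha_*\beta\rho(x,y).$$

On the other hand, $\rho_r(x,y) \ge \delta$ by assumption, so that $d(x,y) = 1 + \beta\rho(x,y)$ and

$$1 + \alpha_*\beta\rho(x,y) = 1 - \alpha_* + \alpha_* d(x,y).$$

Since $d(x,y) \ge 1 + \beta K_*$ this implies the claim with

$$\alpha_1 = \frac{1 + \alpha_*\beta K_*}{1 + \beta K_*},$$

which is smaller than 1 (but close to it when $\beta$ is small) by construction.

□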
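
The last step of this proof is elementary arithmetic, and it can be checked numerically. The sketch below assumes the constant has the form $\alpha_1 = (1 + \alpha_*\beta K_*)/(1 + \beta K_*)$ and verifies that $1 - \alpha_* + \alpha_* d \le \alpha_1 d$ whenever $d \ge 1 + \beta K_*$. The function names and the sample values of $\alpha_*$, $\beta$ and $K_*$ are illustrative assumptions.

```python
def alpha_1(alpha_star, beta, k_star):
    """Contraction constant (1 + alpha_* beta K_*) / (1 + beta K_*)."""
    return (1.0 + alpha_star * beta * k_star) / (1.0 + beta * k_star)


def contraction_holds(alpha_star, beta, k_star, d_values, tol=1e-12):
    """Check 1 - alpha_* + alpha_* d <= alpha_1 d for each d >= 1 + beta K_*."""
    a1 = alpha_1(alpha_star, beta, k_star)
    threshold = 1.0 + beta * k_star
    return all(
        (1.0 - alpha_star) + alpha_star * dv <= a1 * dv + tol
        for dv in d_values
        if dv >= threshold - tol
    )


if __name__ == "__main__":
    print(alpha_1(0.5, 0.1, 4.0))  # strictly below 1
    print(contraction_holds(0.5, 0.1, 4.0, [1.4, 2.0, 10.0]))
```

The bound is sharp at $d = 1 + \beta K_*$ and has slack for larger $d$, which is why $\alpha_1$ approaches 1 as $\beta \to 0$.
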

Proof of Lemma 3.9. This lemma is the most delicate of the three in the sense that it does not follow from "soft" a priori estimates on the dynamic but requires the "hard," quantitative bound given by Assumption 5.

For the proof of this result, we use representation (4) for the distance $d$. Notice that we can assume without loss of generality that the test functions $\varphi$ satisfy $\varphi(0) = 0$ and are Fréchet differentiable, so that the condition $\mathrm{Lip}_d(\varphi) \le 1$ together with (14) implies that

$$\|D\varphi(x)\| \le (\delta^{-1} + \beta)V(x),$$

$$|\varphi(x)| \le 1 + \beta\|x\|V^*(\|x\|) \le 1 + \beta C V^\kappa(x). \tag{24}$$

Combining Assumption 5 with (24) and (16), we see that there exists a constant $C$ such that, for every $\alpha > 0$ there exist $C(\alpha)$ and $T_1(\alpha)$ such that

$$\|D\mathcal{P}_t\varphi(x)\| \le CV^{\kappa\xi(t)+p}(x)\big(C(\alpha) + \alpha\delta^{-1}\big),$$

for every $x \in \mathcal{H}$, every $t \ge T_1(\alpha)$ and every choice for $\delta$ and $\beta$ in $(0, 1]$. Now fix an arbitrary value for $\alpha_2 \in (0, 1)$ and pick $\alpha$ so that $\alpha C \le \alpha_2/2$. By (15) there exists a $T(\alpha) \ge T_1(\alpha)$ so that $\kappa\xi(t) + p < 1$ for all $t \ge T(\alpha)$. At this point, we choose $r = \max\{r_0, \kappa\xi(T(\alpha)) + p\} < 1$. Using the above estimates produces

$$\|D\mathcal{P}_t\varphi(x)\| \le V^r(x)\Big(CC(\alpha) + \frac{\alpha_2}{2}\delta^{-1}\Big) \le \alpha_2\delta^{-1}V^r(x),$$

where we choose $\delta$ sufficiently small in order to obtain the last inequality. Fixing any $\varepsilon > 0$, let $\gamma: [0, 1] \to \mathcal{H}$ be a curve connecting $x$ and $y$ as in (18). We have

$$|\mathcal{P}_t\varphi(x) - \mathcal{P}_t\varphi(y)| \le \int_0^1 \|D\mathcal{P}_t\varphi(\gamma(s))\|\,\|\dot\gamma(s)\|\,ds \le \alpha_2\delta^{-1}\int_0^1 V^r(\gamma(s))\|\dot\gamma(s)\|\,ds$$

$$= \alpha_2\delta^{-1}\rho_r(x,y) + \varepsilon\alpha_2\delta^{-1} \le \alpha_2 d(x,y) + \varepsilon\alpha_2\delta^{-1},$$

where the last inequality uses the fact that we are in the case $\rho_r(x,y) < \delta$. Since $\varepsilon$ was arbitrary, the proof is complete. □

In order to be able to prove Lemma 3.10 and thus conclude the proof of Theorem 3.4, it is essential to know that the region corresponding to the third case is a bounded subset of $\mathcal{H} \times \mathcal{H}$. This is given by the following result:

Lemma 3.11. Suppose that $V$ is as above and define, for some constants $\delta > 0$ and $K > 0$, the set

$$\mathcal{C} = \{(x,y) : \rho_r(x,y) \ge \delta \text{ and } \rho(x,y) \le K\}.$$

Then, there exists an $R > 0$ such that $|x| \vee |y| \le R$ for every $(x,y) \in \mathcal{C}$.

Proof. Note first that if $\delta/K > V^{r-1}(0)$, the set $\mathcal{C}$ is empty and there is nothing to prove.

We now show the contrapositive of the statement, that is, there exists $R > 0$ such that if $|x| > R$ and $\rho(x,y) \le K$, then $\rho_r(x,y) < \delta$. Fixing any $\varepsilon > 0$, let $\gamma$ denote a curve connecting $x$ to $y$ as in (18) with $r = 1$. Since $\rho(x,y) \le K$ and $V \ge 1$, $\gamma$ never leaves the ball of radius $K + \varepsilon$ around $x$. We thus have the bound

$$\rho_r(x,y) \le \int_0^1 V^r(\gamma(s))\|\dot\gamma(s)\|\,ds \le \Big(\sup_{z:\,|z-x| \le K+\varepsilon} V^{r-1}(z)\Big)(\rho(x,y) + \varepsilon) \le \Big(\inf_{z:\,|x-z| \le K+\varepsilon} V(z)\Big)^{r-1}(K + \varepsilon).$$

Since $\varepsilon$ was arbitrary and $V$ is continuous, the bound holds for $\varepsilon = 0$. It follows from (13) that if one chooses $R = K + (V^*)^{-1}\big((\delta/K)^{1/(r-1)}\big)$, one has

$$\Big(\inf_{z:\,|x-z| \le K} V(z)\Big)^{r-1} < \frac{\delta}{K}$$

for every $x$ with $|x| > R$, which concludes the proof of the statement. □
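
The choice of $R$ in this proof can be illustrated for one concrete Lyapunov function. The sketch below assumes $V^*(a) = \exp(\eta a^2)$ (the form used later for the Navier-Stokes equations) and the formula $R = K + (V^*)^{-1}((\delta/K)^{1/(r-1)})$. It then evaluates the bound $K\,(\inf_{|z - x| \le K} V(z))^{r-1}$ on $\rho_r$ for $|x|$ at and slightly beyond $R$. The function names and numerical values are illustrative assumptions.

```python
import math


def radius_R(K, delta, r, eta=1.0):
    """R = K + (V*)^{-1}((delta/K)^(1/(r-1))) for V*(a) = exp(eta * a**2).

    Requires 0 < delta < K and 0 < r < 1, so that the level passed to the
    inverse of V* is larger than 1.
    """
    if not (0.0 < delta < K and 0.0 < r < 1.0):
        raise ValueError("need 0 < delta < K and 0 < r < 1")
    level = (delta / K) ** (1.0 / (r - 1.0))
    return K + math.sqrt(math.log(level) / eta)


def rho_r_upper_bound(x_norm, K, r, eta=1.0):
    """K * (inf of V over the ball of radius K around x)^(r-1), for |x| >= K."""
    inf_V = math.exp(eta * (x_norm - K) ** 2)  # V* is increasing in the norm
    return K * inf_V ** (r - 1.0)


if __name__ == "__main__":
    K, delta, r = 2.0, 0.1, 0.5
    R = radius_R(K, delta, r)
    print(R, rho_r_upper_bound(R, K, r), rho_r_upper_bound(R + 0.01, K, r))
```

At $|x| = R$ the bound equals $\delta$ up to rounding, and it drops strictly below $\delta$ as soon as $|x| > R$, which is the contrapositive statement used in the proof.
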

With this fact secured, we are in the position to give the proof of Lemma 3.10. 

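Proof of Lemma 3.10. Given $K_*$, $\delta$ and $r \in (0,1)$, it follows from Lemma 3.11 that there exists a $C_*(K_*, \delta, r)$ so that

$$\mathcal{C}_d = \{(x,y) : \rho_r(x,y) \ge \delta,\ \rho(x,y) \le K_*\} \subset \{(x,y) : \|x\|, \|y\| \le C_*\}.$$

Hence by Assumption 6 for every $T$ large enough there exists a positive constant $a$ so that for any $(x_0, y_0) \in \mathcal{C}_d$ there exists a coupling $(x_T, y_T)$ of $(\Phi_T(x_0), \Phi_T(y_0))$ such that

$$\mathbf{P}\big(\rho_r(x_T, y_T) < \tfrac{1}{2}\delta\big) \ge a > 0.$$

Clearly $a$ is independent of the choice of $\beta$. Note now that there exists a constant $C$ such that, for every $z \in \mathcal{H}$,

$$\rho(z, 0) \le \int_0^1 V(sz)\|z\|\,ds \le \|z\|V^*(\|z\|) \le CV^\kappa(z).$$

Hence it follows from (15) that there exists a constant $C_*$ (also independent of $\beta$) such that $\mathbf{E}\rho(x_T, y_T) \le \mathbf{E}\rho(x_T, 0) + \mathbf{E}\rho(y_T, 0) \le C_*$ for all $(x_0, y_0) \in \mathcal{C}_d$.

As before, given a random variable $X$ and an event $A$, we define $\mathbf{E}[X; A] = \mathbf{E}[X\mathbf{1}_A]$. Now

$$\mathbf{E}d(x_T, y_T) = \mathbf{E}\big(1 \wedge \delta^{-1}\rho_r(x_T, y_T);\ \rho_r(x_T, y_T) < \tfrac{1}{2}\delta\big) + \mathbf{E}\big(1 \wedge \delta^{-1}\rho_r(x_T, y_T);\ \rho_r(x_T, y_T) \ge \tfrac{1}{2}\delta\big) + \beta\mathbf{E}\rho(x_T, y_T)$$

$$\le \tfrac{1}{2} + \tfrac{1}{2}\mathbf{P}\big(\rho_r(x_T, y_T) \ge \tfrac{1}{2}\delta\big) + \beta C_*$$

$$\le \tfrac{1}{2} + \tfrac{1}{2}(1 - a) + \beta C_* = 1 - \tfrac{1}{2}a + \beta C_*.$$

By making $\beta$ small enough we can ensure that the right-hand side is less than 1. We denote this number by $\alpha_3$. Since $\rho_r(x,y) \ge \delta$ we know that $d(x,y) \ge 1$ and hence

$$\mathbf{E}d(x_T, y_T) \le \alpha_3 \le \alpha_3 d(x,y),$$

which is the quoted result. □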
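
The choice of $\beta$ at the end of this proof is simple to quantify. The following sketch assumes the bound $\mathbf{E}d(x_T, y_T) \le 1 - a/2 + \beta C_*$ and picks one admissible $\beta$; the specific rule $\beta = a/(4C_*)$ (capped at $1/2$) is an illustrative assumption, since any $\beta < a/(2C_*)$ in $(0,1)$ would do.

```python
def alpha_3(a, beta, c_star):
    """Right-hand side 1 - a/2 + beta * C_* of the coupling estimate."""
    return 1.0 - a / 2.0 + beta * c_star


def admissible_beta(a, c_star):
    """One choice of beta in (0, 1) with beta < a / (2 * C_*)."""
    return min(0.5, a / (4.0 * c_star))


if __name__ == "__main__":
    a, c_star = 0.2, 10.0       # coupling probability and moment bound (illustrative)
    beta = admissible_beta(a, c_star)
    print(beta, alpha_3(a, beta, c_star))
```

With this rule one gets $\alpha_3 \le 1 - a/4$, so the contraction constant is controlled by the coupling probability $a$ from Assumption 6 alone.
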

4. Quasi-equivalence of norms. In the finite-dimensional setting where a Lyapunov function exists, it is natural to consider the norm on functions given by

$$\sup_{x \in \mathcal{H}} \frac{|\varphi(x)|}{V(x)}. \tag{25}$$

(See, e.g., [40].) The norm on measures associated to it by duality is a weighted total variation norm. This norm can still be used in the infinite-dimensional setting provided that the driving noise is sufficiently nondegenerate; see, for example, [9] for a general theory and [19] for a recent ergodicity result on the stochastic two-dimensional Navier-Stokes equation.

In the present article, we are, however, interested in the situation where the driving noise is very degenerate. Indeed we assumed, for our main example of interest, that the driving noise is finite-dimensional, whereas the state space of our system is of course infinite-dimensional. In this setting, although it is possible to show that topological irreducibility holds, we do not expect the corresponding Markov process to be $\psi$-irreducible for any measure $\psi$. This is because, even though the system is formally hypoelliptic, we consider it very unlikely that it has the strong Feller property. It is indeed very simple to construct infinite-dimensional Ornstein-Uhlenbeck processes where the noise acts on every degree of freedom (so that the system is formally elliptic), but the system nevertheless lacks $\psi$-irreducibility. Therefore, the results from [40] are not applicable to the present situation and we do not expect to be able to get convergence results in the total variation norm. It is therefore natural to look for a modification of (25) to the Wasserstein setting.

Motivated by these considerations, we introduce the following family of norms:

$$\|\varphi\|_{V^r} = \sup_{x \in \mathcal{H}} \frac{|\varphi(x)| + \|D\varphi(x)\|}{V^r(x)}.$$

When we take $r = 1$, we will simply write $\|\varphi\|_V$. The remainder of this section is devoted to showing that, modulo the semigroup $\mathcal{P}_t$, these norms can be considered to be equivalent to the norms $\|\cdot\|_{\rho_r}$ introduced in (22). Once this has been shown, we will have that Theorem 3.6 holds with the $\|\cdot\|_\rho$ norm replaced by the $\|\cdot\|_V$ norm defined above. This result is contained in Corollary 4.4. We begin by showing that the norm $\|\cdot\|_{\rho_r}$ is bounded from above and from below by the $\|\cdot\|_{V^{r'}}$ norm for a choice of $r'$ not necessarily equal to $r$.

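Proposition 4.1. There exists a constant $C$ such that

$$C^{-1}\|\varphi\|_{V^{\kappa r}} \le \|\varphi\|_{\rho_r} \le C\|\varphi\|_{V^r},$$

for every $r \in [0, 1]$ and every Fréchet differentiable function $\varphi$.

Note first that:

Lemma 4.2. Recall the definition of $\|\cdot\|_{\rho_r}$ from (22) and let $\varphi: \mathcal{H} \to \mathbf{R}$ be Fréchet differentiable. Then one also has

$$\|\varphi\|_{\rho_r} = \sup_{x \in \mathcal{H}} \frac{\|D\varphi(x)\|}{V^r(x)} + \Big|\int \varphi(x)\,\mu_*(dx)\Big|. \tag{26}$$

Proof. Since

$$\lim_{\varepsilon \to 0}\ \sup_{y:\,\|y-x\| < \varepsilon} \frac{|\varphi(x) - \varphi(y)|}{\rho_r(x,y)} = \frac{\|D\varphi(x)\|}{V^r(x)},$$

$\|\varphi\|_{\rho_r}$ is greater than or equal to the right-hand side in (26). In order to prove the reverse inequality, we can assume without loss of generality that $\int \varphi(x)\,\mu_*(dx) = 0$ and $\|D\varphi(x)\| \le V^r(x)$ for all $x$. One then has

$$|\varphi(x) - \varphi(y)| = \Big|\int_0^1 \langle D\varphi(\gamma(s)), \dot\gamma(s)\rangle\,ds\Big| \le \int_0^1 V^r(\gamma(s))\|\dot\gamma(s)\|\,ds,$$

for any smooth path $\gamma$ connecting $x$ to $y$. Taking the infimum over all such $\gamma$ proves the claim. □
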
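
The identity behind Lemma 4.2 can be checked numerically in one dimension, where the infimum over paths is attained by the straight segment. The sketch below is a toy illustration under several assumptions that are not taken from the article: the state space is the real line, $r = 1$, $V(s) = 1 + s^2$, so that $\rho(x,y) = |F(x) - F(y)|$ with $F(s) = s + s^3/3$, and the test function is $\varphi = \sin$. The term involving $\mu_*$ is the same on both sides of (26) and is left out.

```python
import math


def F(s):
    """Antiderivative of V(s) = 1 + s**2; on the line, rho(x, y) = |F(x) - F(y)|."""
    return s + s ** 3 / 3.0


def lipschitz_seminorm(phi, grid):
    """sup over grid pairs of |phi(x) - phi(y)| / rho(x, y)."""
    best = 0.0
    for i, x in enumerate(grid):
        for y in grid[i + 1:]:
            best = max(best, abs(phi(x) - phi(y)) / abs(F(x) - F(y)))
    return best


def derivative_seminorm(dphi, grid):
    """sup over the grid of |phi'(x)| / V(x)."""
    return max(abs(dphi(x)) / (1.0 + x * x) for x in grid)


grid = [-3.0 + 6.0 * i / 600 for i in range(601)]  # step 0.01, contains 0
lip = lipschitz_seminorm(math.sin, grid)
der = derivative_seminorm(math.cos, grid)

if __name__ == "__main__":
    print(lip, der)
```

On a finite grid the Lipschitz-type quantity stays slightly below the derivative-type quantity and approaches it as the grid is refined, in line with the limit $\varepsilon \to 0$ used in the proof.
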
Proof of Proposition 4.1. We start with the second inequality. It follows from Lemma 4.2 that it suffices to show that there exists $C > 0$ such that

$$\Big|\int \varphi(x)\,\mu_*(dx)\Big| \le C\|\varphi\|_{V^r}.$$

This follows immediately from the fact that $V$ is integrable against $\mu_*$ by (15).

In order to show that the first inequality holds, fix $\varphi$ with $\|\varphi\|_{\rho_r} = 1$. One then has

$$|\varphi(x) - \varphi(0)| \le \rho_r(x, 0) \le CV^{\kappa r}(x),$$

where the second inequality follows from (14). Furthermore, $\int \rho_r(x, 0)\,\mu_*(dx) \le \int \rho(x, 0)\,\mu_*(dx) = C$. This yields

$$\Big|\int \varphi(x)\,\mu_*(dx) - \varphi(0)\Big| \le \int |\varphi(x) - \varphi(0)|\,\mu_*(dx) \le C,$$

so that $|\varphi(0)| \le C + \|\varphi\|_{\rho_r} \le C + 1$. Combining these bounds, we get

$$|\varphi(x)| \le |\varphi(0)| + |\varphi(x) - \varphi(0)| \le CV^{\kappa r}(x),$$

for some $C > 0$, which completes the proof. □

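We now show that the semigroup $\mathcal{P}_t$ has the following contraction properties:

Theorem 4.3. There exist constants $C$ and $\gamma$ such that, for every $r \in [r_0, \kappa]$, every Fréchet differentiable function $\varphi$ and every $t > 0$, one has the bounds

$$\|\mathcal{P}_t\varphi\|_{V^{r(t)}} \le Ce^{\gamma t}\|\varphi\|_{V^r}, \qquad \|\mathcal{P}_t\varphi\|_{\rho_{r(t)}} \le Ce^{\gamma t}\|\varphi\|_{\rho_r},$$

where $r(t) = \max\{\xi(t)r, r_0\}$.

Proof. It suffices to show the claims for $t \in [0, 1]$ since the other cases follow by iteration. To begin with, we get bounds on the common term in both norms:

$$\|D\mathcal{P}_t\varphi(x)\| \le \mathbf{E}\|D\varphi(\Phi_t(x))\|\,\|D\Phi_t(x)\| \le \sup_{y} \frac{\|D\varphi(y)\|}{V^r(y)}\,\mathbf{E}V^r(\Phi_t(x))\|D\Phi_t(x)\| \le C\sup_{y} \frac{\|D\varphi(y)\|}{V^r(y)}\,V^{r\xi(t)}(x),$$

where we made use of (15) in the last inequality. On the other hand, we have

$$|\mathcal{P}_t\varphi(x)| \le \mathbf{E}|\varphi(\Phi_t(x))| \le C\sup_{y} \frac{|\varphi(y)|}{V^r(y)}\,V^{r\xi(t)}(x)$$

and, from the invariance of $\mu_*$,

$$\int \mathcal{P}_t\varphi(x)\,\mu_*(dx) = \int \varphi(x)\,\mu_*(dx).$$

Combining these estimates proves the quoted results. □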

Corollary 4.4. There exist a time $T$ and a constant $C$ such that

$$\|\mathcal{P}_T\varphi\|_{V^r} \le C\|\varphi\|_{\rho_r},$$

for every Fréchet differentiable function $\varphi$ and every $r \in (\varepsilon/(1 - \xi(1)), 1]$.

Proof. Let $r_n = \xi(1)^n\kappa r + \varepsilon(1 - \xi(1)^n)/(1 - \xi(1))$ as above. Then, we get from Theorem 4.3 and Proposition 4.1 that

$$\|\mathcal{P}_n\varphi\|_{V^{r_n}} \le C^n\|\varphi\|_{V^{\kappa r}} \le KC^n\|\varphi\|_{\rho_r},$$

for some constants $C$ and $K$. Since we assume that $r > \varepsilon/(1 - \xi(1)) = \lim_{n \to \infty} r_n$, there exists $m$ such that $r_m < r$. The fact that $\|\varphi\|_{V^r} \le \|\varphi\|_{V^{r_m}}$ completes the proof. □

An immediate consequence of Corollary 4.4 is the following result which states that Theorem 3.6 holds with $\|\cdot\|_\rho$ replaced by $\|\cdot\|_V$.

Theorem 4.5. Let $\mathcal{P}_t$ be as in Theorem 3.4. Then, there exist constants $\gamma > 0$ and $C > 0$ such that

$$\|\mathcal{P}_t\varphi - \mu_*\varphi\|_V \le Ce^{-\gamma t}\|\varphi - \mu_*\varphi\|_V,$$

for every $\varphi \in \mathcal{B}$ and every $t > 0$.

5. Application to the 2D stochastic Navier-Stokes equations. We now apply the results of the previous sections to the two-dimensional Navier-Stokes equations on the torus $\mathbf{T}^2$, which is our main motivation for the present work. Recall that, in the vorticity formulation (1), these equations are given by

$$dw = \nu\Delta w\,dt + B(\mathcal{K}w, w)\,dt + f\,dt + Q\,dW(t), \tag{27}$$

where $B(u,v) = -(u \cdot \nabla)v$ is the usual Navier-Stokes nonlinearity, $W$ is a cylindrical Wiener process on $\mathcal{H}$, and $Q: \mathcal{H} \to \mathcal{H}$ is a positive self-adjoint finite rank operator commuting with translations. The viscosity $\nu > 0$ is arbitrary. We use the notation laid out in the Introduction. In particular, we denote by $e_k$, $k \in \mathbf{Z}^2$, the eigenfunctions of $\Delta$ and by $q_k$ the corresponding eigenvalues of $Q$. Unless indicated otherwise, we will assume that the constant component $f$ of the body force and the coefficients $q_k$ satisfy Assumption 1.

It is well known (see, e.g., [8, 17]) that (27) has a unique solution under much weaker assumptions on the covariance operator $Q$. It is also well known that under similar conditions, (27) has an invariant measure $\mu_*$. The uniqueness of this invariant measure is a much harder problem and has been a field of intense research over the past decade. Early results can be found in [9, 17, 36]. Until recently, the consensus that emerged in [5, 6, 14, 20, 28, 34, 37, 38] was that the uniqueness of the invariant measure
can be obtained, provided that all the qk with \k\ 2 < N are nonzero, for 
some value iV~^g|/i/ 3 . To the best of the author's knowledge, the only 
exception to this were the results of [15], which indicated that the invariant measure $\mu_*$ should be unique provided that there exist $R > 0$ and $\alpha$ large enough such that all the $q_k$ with $|k| \ge R$ are bounded from above and from below by multiples of $|k|^{-\alpha}$. The uniqueness problem was eventually solved
under Assumption 1 by the authors in the recent article [21]. This assumption is close to optimal since it only fails in situations where there exists a closed subspace $\tilde{\mathcal{H}} \subset \mathcal{H}$ that is invariant for (27). It can then be shown that there always exists a unique ergodic invariant measure $\mu_*$ for (27) such that $\mu_*(\tilde{\mathcal{H}}) = 1$.
We will show in this section that under Assumption 1, the random flow generated by the solutions of (27) satisfies the assumptions of Theorem 3.4 with $V(w) = \exp(\eta\|w\|^2)$ for a positive $\eta$ sufficiently small. We will then exhibit a Banach space of observables $\mathcal{B}$ which is such that the semigroup $\mathcal{P}_t$ generated by (27) extends to a strongly continuous semigroup of operators on $\mathcal{B}$. The results from Theorem 3.4 will then be shown to imply that the operator norm of $\mathcal{P}_t$ converges to 0, so that in particular its generator $\mathcal{L}$ has a spectral gap in the sense that there exists a constant $g > 0$ such that the spectrum of $\mathcal{L}$ is contained in $\{0\} \cup \{\mathrm{Re}\,\lambda < -g\}$. We conclude by showing first that $\mathcal{L}$ acts on cylindrical functions as a second-order differential operator as one would expect and then that all the structure functions for (27) converge exponentially fast to their limit values.

5.1. General Lyapunov structure. We start with a result that we have 
found to be very useful when trying to check that (15) holds for a particular 
system. 

Lemma 5.1. Let $U$ be a real-valued semimartingale

$$dU(t, \omega) = F(t, \omega)\,dt + G(t, \omega)\,dB(t, \omega),$$

where $B$ is a standard Brownian motion. Assume that there exist a process $Z$ and positive constants $b_1$, $b_2$, $b_3$, with $b_2 > b_3$, such that $F(t, \omega) \le b_1 - b_2 Z(t, \omega)$, $U(t, \omega) \le Z(t, \omega)$ and $G(t, \omega)^2 \le b_3 Z(t, \omega)$ almost surely. Then, the bound

Eexp(W) + ^^ f t Z(s)ds) < W^/M exp(^(0)e-^/^) 
holds for any t>0. 

Proof. Fixing a time $t > 0$ and $a > 0$, set

Y(s) = exp (| ( S - t)) J7( S ) + | jT exp (|(r - i)) Z(r) dr 
and M(s) = / s exp(^(r-t))G(r,w)d5(r,c 1 ;). Then 

dY(s) = expf j(s - t)^ (V(s,cj) + j (17(a) + (is + dM(s). 

If we restrict to s £ [0,t], then we have that 

^(a) < ^(0) + ^ - ^ f exp (%r - t)) Z(r) dr + M(s). 

02 2 JO V 4 / 

Next observe that Y(0) = exp(-^t)[/(0), F(i) > E/(t) + ^ e " 4 ' 2t/4 f Q f Z(s) ds 
and 

M(s) - | jT exp (|(r - t)) Z(r) dr < M( S ) - i|(M) (s) 

because exp(^(r — t))G(r 2 ) < exp(^-(r — t))Z(r) almost surely for r G [0, t]. 
Since for continuous local martingales M(t), one has the exponential mar- 
tingale inequality 

p(su P M(s) - |(M)( S ) >pj <exp(-a/3), 

we have that 

In order to conclude, it now suffices to use the fact that if $X$ is an arbitrary random variable and $a > 1$ is a constant such that $\mathbf{P}(X > K) \le \exp(-aK)$ for every $K > 0$, then $\mathbf{E}\exp(X) \le a/(a - 1)$. □
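
The elementary fact closing this proof follows from $\mathbf{E}\exp(X) \le 1 + \int_0^\infty e^{K}\,\mathbf{P}(X > K)\,dK \le 1 + 1/(a-1)$. It is sharp: for an exponential random variable with rate $a$ the tail bound holds with equality and so does the conclusion. The sketch below checks this extremal case by a midpoint-rule quadrature; the function names, the truncation point and the number of steps are illustrative assumptions.

```python
import math


def moment_bound(a):
    """The bound a / (a - 1), valid for a > 1."""
    return a / (a - 1.0)


def exp_moment_exponential(a, upper=40.0, steps=200000):
    """Midpoint-rule approximation of E exp(X) for X with density a*exp(-a*x), x >= 0.

    For this X one has P(X > K) = exp(-a*K) for every K > 0, so the
    hypothesis of the tail bound holds with equality.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += a * math.exp((1.0 - a) * x)  # exp(x) times the density
    return total * h


if __name__ == "__main__":
    for a in (2.0, 3.0):
        print(a, exp_moment_exponential(a), moment_bound(a))
```

In the lemma this fact is applied with a constant built from $b_2$ and $b_3$, which is where the requirement that $b_2$ be larger than $b_3$ enters.
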

5.2. Verification of the assumptions of Theorem 3.4. We first show that Lemma 5.1 indeed implies that:

Proposition 5.2. There exists $\eta_0$ such that, for every $\eta \in (0, \eta_0]$, solutions to (27) satisfy Assumption 4 with $V(w) = \exp(\eta\|w\|^2)$.
Proof. It is clear that $V$ satisfies (13) and (14) so that it remains to show that (15) holds. Note that if we set $U(t) = \eta\|w(t)\|^2$, we have from Itô's formula

$$dU(t) = \eta\big(\operatorname{tr} Q^2 + 2\langle w(t), f\rangle - 2\nu\|w(t)\|_1^2\big)\,dt + 2\eta\|Qw(t)\|\,dB(t),$$

for some Brownian motion $B$. Here and in the sequel, we denote by $\|w\|$ the $L^2$-norm of $w$ and by $\|w\|_1 = \|\nabla w\|$ its $H^1$-norm. Since $\|w\|_1 \ge \|w\|$ and $2\langle w, f\rangle \le \nu^{-1}\|f\|^2 + \nu\|w\|^2$, this shows that we are in the situation of Lemma 5.1 if we set $Z(t) = \eta\|w(t)\|_1^2$ and

$$b_1 = \eta\operatorname{tr} Q^2 + \frac{\eta}{\nu}\|f\|^2, \qquad b_2 = \nu, \qquad b_3 = 4\eta\|Q\|.$$

In particular, this shows that, for every $\eta < \nu/(4\|Q\|)$, there exists a constant $C$ such that, for every $t \in [0, 1]$,

(28) Eexp(^|Ht)|| 2 + UTie 2 V ' 2 J\\w{s)\\\ds^ < Cexp(7 ? ||u;(0)|| 2 e-^/ 2 )). 

On the other hand, we know from Lemma A.1 that, for every $\kappa > 0$, there exists a constant $C$ such that

$$\|D\Phi_t(w)\| \le C\exp\Big(\kappa\int_0^t \|w(s)\|_1^2\,ds\Big) \qquad \forall t \in [0, 1],$$

holds almost surely for every $w \in \mathcal{H}$. Combining this with (28) shows that (15) holds with $\xi(t) = e^{-\nu t/2}$ for arbitrarily small values of $r_0$. □

Recall now that the following "gradient estimate" is the main technical result of [21]:

Proposition 5.3. For every $\eta > 0$ and every $\alpha > 0$, there exist constants $C_{\eta,\alpha}$ such that, for every Fréchet differentiable function $\varphi$ from $\mathcal{H}$ to $\mathbf{R}$, one has the bound

$$\|D\mathcal{P}_n\varphi(w)\| \le \exp(\eta\|w\|^2)\Big(C_{\eta,\alpha}\sqrt{(\mathcal{P}_n|\varphi|^2)(w)} + \alpha^n\sqrt{(\mathcal{P}_n\|D\varphi\|^2)(w)}\Big),$$

for every $w \in \mathcal{H}$ and $n \in \mathbf{N}$.

Remark 5.4. The works [21, 39] made the assumption $f = 0$. However, the arguments presented there work without any modification under the assumption that $f \in \operatorname{range} Q$. Note, for example, that Girsanov's formula implies that the transition probabilities for (SNS) with $f = 0$ are equivalent to the transition probabilities with $f \in \operatorname{range} Q$. In particular, this means that the proof of weak irreducibility from [21] carries over to the setting of this paper.

Proposition 5.3 immediately implies that Assumption 5 is satisfied for every choice of $\eta$, so that it remains to verify Assumption 6. This, however, follows immediately from [13], Lemma 3.1, and Remark 5.4 above. As a consequence, we have just shown that:

Theorem 5.5. If Assumption 1 holds, there exists $\eta_0 > 0$ such that, for every $\eta \le \eta_0$, the stochastic flow solving (27) satisfies the assumptions of Theorem 3.4 with $V(w) = \exp(\eta\|w\|^2)$. Hence, the conclusions of Theorems 3.4, 3.6 and 4.5 hold.

5.3. Spectral gap for the generator. In this section, we show that it is possible to extend the Markov semigroup $\mathcal{P}_t$ generated by solutions to (27) to some Banach space of observables $\mathcal{B}$ in such a way that:

1. The semigroup $\mathcal{P}_t$ is strongly continuous on $\mathcal{B}$.

2. There exists $g > 0$ such that $\sigma(\mathcal{P}_t) \setminus \{1\}$ is included in the disk of radius $e^{-gt}$ for every $t > 0$. Here, $\sigma(\mathcal{P}_t)$ denotes the spectrum of $\mathcal{P}_t$ viewed as a bounded operator on $\mathcal{B}$.

Remark 5.6. It follows from standard semigroup theory that the above statements imply that $\mathcal{P}_t$ possesses a generator $\mathcal{L}$ densely defined on $\mathcal{B}$ (see, e.g., [10], Theorem 1.7) and that there exists $g > 0$ such that $\mathrm{Re}(\lambda) \le -g$ for every $\lambda \in \sigma(\mathcal{L}) \setminus \{0\}$ (see, e.g., [10], Theorem 2.16).
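
The relation between the spectrum of the semigroup and that of its generator is easiest to see in a finite-dimensional toy case. The sketch below is not the Navier-Stokes setting: it assumes a two-state continuous-time Markov chain with jump rates `a` and `b`, whose generator has eigenvalues $0$ and $-(a+b)$, so that the gap is $g = a + b$. The closed form for $\exp(t\mathcal{L})$ and all names are illustrative.

```python
import math


def two_state_semigroup(a, b, t):
    """P_t = exp(t L) for the generator L = [[-a, a], [b, -b]] with a, b > 0.

    Uses the closed form P_t = Pi + exp(-(a+b) t) (I - Pi), where every row
    of Pi is the invariant measure pi = (b, a) / (a + b).
    Returns (P_t, pi, decay) with decay = exp(-(a+b) t).
    """
    s = a + b
    pi = (b / s, a / s)
    decay = math.exp(-s * t)
    P = [
        [pi[0] + decay * (1.0 - pi[0]), pi[1] - decay * pi[1]],
        [pi[0] - decay * pi[0], pi[1] + decay * (1.0 - pi[1])],
    ]
    return P, pi, decay


def apply(P, phi):
    """Action of the semigroup on an observable phi = (phi_0, phi_1)."""
    return [P[i][0] * phi[0] + P[i][1] * phi[1] for i in range(2)]


if __name__ == "__main__":
    P, pi, decay = two_state_semigroup(1.0, 2.0, 0.7)
    phi = (3.0, -1.0)
    mean = pi[0] * phi[0] + pi[1] * phi[1]
    # P_t phi - mean equals decay * (phi - mean) in each state.
    print([v - mean for v in apply(P, phi)], [decay * (p - mean) for p in phi])
```

Here $\sigma(\mathcal{P}_t) = \{1, e^{-(a+b)t}\}$ and $\sigma(\mathcal{L}) = \{0, -(a+b)\}$, which is the finite-dimensional analog of the two properties listed above.
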

Before we give the precise statement of our results, let us turn to the construction of the Banach space $\mathcal{B}$. Given a Hilbert space $\mathcal{H}$, we define $C_0^\infty(\mathcal{H})$ by

$$C_0^\infty(\mathcal{H}) = \{\varphi \circ \Pi \mid \Pi: \mathcal{H} \to \mathbf{R}^n \text{ linear},\ \varphi \in C_0^\infty(\mathbf{R}^n)\}.$$

Note in particular that elements of $C_0^\infty(\mathcal{H})$ are Fréchet differentiable of all orders. Given $\eta > 0$, define $\mathcal{B}_\eta$ as the closure of $C_0^\infty(\mathcal{H})$ under the norm

$$\|\varphi\|_\eta = \sup_{w \in \mathcal{H}} \exp(-\eta\|w\|^2)\big(|\varphi(w)| + \|D\varphi(w)\|\big). \tag{29}$$

We also denote by $\tilde{\mathcal{B}}_\eta$ the closure under this norm of the space of all Fréchet differentiable functions $\varphi$ such that $\|\varphi\|_\eta$ is finite.

Remark 5.7. The space $\mathcal{B}_\eta$ is much smaller than $\tilde{\mathcal{B}}_\eta$. In particular, elements of $\mathcal{B}_\eta$ are continuous when $\mathcal{H}$ is equipped with the topology of weak convergence, so that $w \mapsto \|w\|^2$ does not belong to $\mathcal{B}_\eta$, even though it obviously belongs to $\tilde{\mathcal{B}}_\eta$. However, $w \mapsto \|Kw\|^2$ does belong to $\mathcal{B}_\eta$, provided that $K: \mathcal{H} \to \mathcal{H}$ is a compact operator.

Remark 5.8. The fact that the vorticity belongs to $\mathcal{H} = L^2$ does not ensure that the corresponding velocity field is continuous. Therefore, point evaluations of the velocity field do not belong to $\mathcal{B}_\eta$. This fact can, however, be dealt with and we will do so in Section 5.4.

Remark 5.9. Given an orthonormal basis $\{e_n\}$ of $\mathcal{H}$, one could have restricted oneself to the set of all functions of the type $w \mapsto \varphi(\langle w, e_1\rangle, \ldots, \langle w, e_n\rangle)$ with $\varphi \in C_0^\infty(\mathbf{R}^n)$. It is easy to check that the closure of this set under the norm (29) is again equal to $\mathcal{B}_\eta$, independently of the choice of basis.

As a consequence of this, it is a straightforward exercise to check that polynomials in $\langle w, e_n\rangle$ with rational coefficients form a dense subset of $\mathcal{B}_\eta$, so that it is a separable Banach space.

The first result of this section is the following:

Theorem 5.10. For $\eta$ sufficiently small, $\mathcal{P}_t$ extends to a $C_0$-semigroup on $\mathcal{B}_\eta$.

Proof. Define $\Pi_n$ as the orthogonal projection in $\mathcal{H}$ onto the first $n$ Fourier modes. The proof of this result is broken into two distinct steps as follows:

1. The semigroup $\mathcal{P}_t$ extends to a semigroup of bounded operators on $\mathcal{B}_\eta$ that is uniformly bounded as $t \to 0$.

2. One has $\|\mathcal{P}_t\varphi - \varphi\|_\eta \to 0$ as $t \to 0$ for a dense subset of elements of $\mathcal{B}_\eta$.

Note first that it follows from the a priori bounds of Lemma A.1 that if $\varphi: \mathcal{H} \to \mathbf{R}$ is a Fréchet differentiable function such that $\|\varphi\|_\eta < \infty$, then $\mathcal{P}_t\varphi$ is again Fréchet differentiable and there exist constants $C_t$ that remain bounded as $t \to 0$ such that

$$\|\mathcal{P}_t\varphi\|_\eta \le C_t\|\varphi\|_\eta,$$

provided that $\eta$ is sufficiently small. This shows that $\mathcal{P}_t$ can be extended to a semigroup on $\tilde{\mathcal{B}}_\eta$ which is uniformly bounded as $t \to 0$.

Since the norm on $\mathcal{B}_\eta$ is the same as on $\tilde{\mathcal{B}}_\eta$, the first claim follows if we can show that $\mathcal{P}_t$ maps $\mathcal{B}_\eta$ into itself. For an arbitrary function $\varphi \in C_0^\infty(\mathcal{H})$, we will show that

$$\lim_{n \to \infty} \|\mathcal{P}_t\varphi - (\mathcal{P}_t\varphi) \circ \Pi_n\|_\eta = 0, \tag{30}$$

where $\Pi_n$ denotes the orthogonal projection in $\mathcal{H}$ onto the Fourier modes with $|k| \le n$. This is sufficient since it follows from the a priori bounds (46), (43), (40) and (42) that the function $(\mathcal{P}_t\varphi) \circ \Pi_n$ is twice Fréchet differentiable and that, together with its derivative, it grows more slowly than $\exp(\eta\|w\|^2)$ at infinity, so that it belongs to $\mathcal{B}_\eta$.

Fix a generic element $w \in \mathcal{H}$ and a natural number $n > 0$, and write $\tilde w = \Pi_n w$. We denote by $\Phi_t$ the random flow solving (27) and set $w_t = \Phi_t(w)$, $\tilde w_t = \Phi_t(\tilde w)$, $\rho_t = w_t - \tilde w_t$. We also use the notation

$$J_t = (D\Phi_t)(w), \qquad \tilde J_t = (D\Phi_t)(\tilde w), \qquad J_{\rho,t} = J_t - \tilde J_t.$$

Since the derivatives of $\varphi$ are bounded, the expression inside the limit in (30) is bounded by


C sup e-"HI 2 (E||p*|| + JEjp^kJj^ + E||J p , t ||). 

The claim then follows immediately from Theorem A.3 and from the a priori bounds of Lemma A.1.

In order to show that the second claim holds, fix a function $\varphi \in C_0^\infty(\mathcal{H})$ which is of the form $\varphi = \tilde\varphi \circ \Pi_n$ for a $C_0^\infty$ function $\tilde\varphi$ and some $n > 0$. It is straightforward to check that there exists a constant $C$ (depending on $\varphi$) such that

$$\|\mathcal{P}_t\varphi - \varphi\|_\eta \le C\sup_{w \in \mathcal{H}} e^{-\eta\|w\|^2}\big(\mathbf{E}\|\Pi_n w_t - \Pi_n w\| + \mathbf{E}\|\Pi_n J_t - \Pi_n\|\big) = C\sup_{w \in \mathcal{H}} e^{-\eta\|w\|^2}\big(G_1(t) + G_2(t)\big).$$

Since $n$ is fixed, both terms are relatively easy to control in the limit $t \to 0$.

Let us first bound $G_1(t)$. It follows from the variation of constants formula (or the mild formulation of a solution) and (37) from the Appendix that

G x (t)< ||(1 -n„e" At )|M|+E f t n n e vA ^B(]Cw s ,w s )ds 

Jo 

<(l-e- w2 *)H| +Cn 3 f E\\IL n B(1Cw s ,w s )\\- 3 ds 

Jo 



< (l-e- un2t )\\w\\ +Cn 3 f E\\w s \\ 2 ds. 

Jo 





Since $n$ is fixed, it is obvious that the first term converges to 0 as $t \to 0$. By (41), $\mathbf{E}\|w_s\|^2$ is uniformly bounded in time by $C\exp(\eta\|w\|^2)$. Hence the second term is bounded by $C\exp(\eta\|w\|^2)t$ and thus converges to 0 as $t \to 0$.

The term $G_2(t)$ is bounded in much the same way. Again it follows from the variation of constants formula that

$$\Pi_n J_t\xi = \Pi_n e^{\nu\Delta t}\xi + \int_0^t e^{\nu\Delta(t-s)}\Pi_n\big(B(\mathcal{K}J_s\xi, w_s) + B(\mathcal{K}w_s, J_s\xi)\big)\,ds.$$

It follows from (37) that one has the almost sure bound

$$\|\Pi_n J_t - \Pi_n\| \le 1 - e^{-\nu n^2 t} + Cn^3\int_0^t \|w_s\|\,\|J_s\|\,ds.$$

Taking expectations, the needed bound showing that $G_2(t) \to 0$ as $t \to 0$ follows from Lemma A.1 and the same reasoning as used for $G_1(t)$. □

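Since the semigroup $\mathcal{P}_t$ is strongly continuous on $\mathcal{B}_\eta$, it has an infinitesimal generator $\mathcal{L}$. Itô's formula allows us to show that $\mathcal{L}$ is an extension of some concrete second-order differential operator:

Lemma 5.11. Let $\mathcal{L}$ be the generator of $\mathcal{P}_t$ on $\mathcal{B}_\eta$ and let $\varphi \in \mathcal{B}_\eta$ be of the form $\varphi = \tilde\varphi \circ \Pi_n$ for some $n$ and some function $\tilde\varphi \in C_0^\infty(\mathbf{R}^n)$. Then $\varphi \in \mathcal{D}(\mathcal{L})$ and

$$(\mathcal{L}\varphi)(w) = \nu\langle \Delta D\varphi(w), w\rangle - \langle B(\mathcal{K}w, D\varphi(w)), w\rangle + \langle f, D\varphi(w)\rangle + \tfrac{1}{2}\operatorname{tr}(Q D^2\varphi(w)), \tag{31}$$

for every $w \in \mathcal{H}$.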

Proof. Fix a function $\varphi$ as in the statement of the lemma. Note first that $D\varphi(w) \in \mathcal{D}(\Delta)$ so that (31) does indeed make sense for every $w \in \mathcal{H}$. One has

$$\Pi_n w_t = \Pi_n w + \nu\int_0^t \Delta\Pi_n w_s\,ds + \int_0^t \Pi_n B(\mathcal{K}w_s, w_s)\,ds + t\,\Pi_n f + \Pi_n QW(t),$$

so that Itô's formula immediately implies that

$$\mathcal{P}_t\varphi(w) - \varphi(w) = \int_0^t \mathcal{P}_s\mathcal{L}\varphi(w)\,ds, \tag{32}$$

where $\mathcal{L}\varphi$ is given by (31). Let us show that $\mathcal{L}\varphi \in \mathcal{B}_\eta$. The only term in (31) for which this is not immediate is the one involving the nonlinearity $B$. Since $D\varphi(w) = D\varphi(\Pi_m w)$ for $m \ge n$, one has the bound

\(B{Kw,D<f>{w)),w) - (B(KJl m w,D^(U m w)),IL m w)\ 

< | (BQCw - ICU m w, Dcf>(w)),w) | + | (B(/CIl m w, D</>(w)),w - U m w) \ 

< (7]|x;(«; - ru^initoHH^Cti;)!!! + cKwiid^MHallw -iimwll-i 
<-IMI 2 , 

n 

and similarly for its derivative. The penultimate inequality in this equation 
is obtained by making use of the bound ||5(/Cio,u;)||i < C||u;|| 1 1 iz) 1 1 3 - The 
result then follows from (32) and the fact that Vt is strongly continuous. □ 






5.4. Convergence of structure functions. In this section, we show that if $\varphi: H^1 \to \mathbf{R}$ is a smooth function with at most polynomial growth, then there exist constants $C$, $\eta$ and $\gamma$ (with only $C$ depending on $\varphi$) such that

$$\Big|(\mathcal{P}_t\varphi)(w) - \int_{H^1} \varphi(w')\,\mu_*(dw')\Big| \le Ce^{\eta\|w\|^2 - \gamma t}. \tag{33}$$

In particular, since $w \in H^1$ implies that $v \in H^2 \subset C(\mathbf{T}^2, \mathbf{R}^2)$, polynomials of point evaluations of the velocity field fall into this class of observables.

It follows from the results of the previous section that (33) is an immediate consequence of the following result:

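Proposition 5.12. Let $N > 0$ and let $\varphi: H^1 \to \mathbf{R}$ be a smooth function with

$$|||\varphi|||_N = \sup_{w \in H^1} \frac{|\varphi(w)| + \|D\varphi(w)\|}{1 + \|w\|_1^N} < \infty.$$

Then, for every $t > 0$ and every $\eta > 0$ one has $\mathcal{P}_t\varphi \in \tilde{\mathcal{B}}_\eta$. In particular there exist constants $C_{N,t}$ such that $\|\mathcal{P}_t\varphi\|_\eta \le C_{N,t}|||\varphi|||_N$.

Proof. Fix arbitrary values for $t > 0$ and $\eta > 0$. Let $w \in \mathcal{H}$ and let $w_t$ denote the solution to (SNS) starting at $w$. One then has

$$|\mathcal{P}_t\varphi(w)| \le |||\varphi|||_N\,\mathbf{E}(1 + \|w_t\|_1^N) \le C\exp(\eta\|w\|^2)\,|||\varphi|||_N,$$

where the second inequality follows from (41). One furthermore has, for an arbitrary vector $\xi \in \mathcal{H}$,

$$|D\mathcal{P}_t\varphi(w)\xi| = |\mathbf{E}D\varphi(w_t)J_{0,t}\xi| \le |||\varphi|||_N\big(\mathbf{E}(1 + \|w_t\|_1^N)^2\big)^{1/2}\big(\mathbf{E}\|J_{0,t}\xi\|_1^2\big)^{1/2} \le C\exp(\eta\|w\|^2)\,|||\varphi|||_N\,\|\xi\|,$$

where the last bound was obtained by combining (41), (44) and (40). The claim follows immediately from these two estimates. □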



5.5. Regular dependence on the parameters. In this section, we present one possible application of the results obtained in this article. It was shown in [21] that, for a large class of parameters $\nu$, $Q$ and $f$, (SNS) has a unique invariant measure $\mu_*$. One question which was not addressed was the nature of the dependence of $\mu_*$ on these parameters. The results obtained in this article enable us to give a relatively simple argument that shows that $\mu_*$ depends in a continuous way on all the parameters involved. In [32], Majda and Wang proved that in the setting where the dissipation dominates the dynamics, the invariant measure depends continuously on the viscosity. This is a reflection of the fact that in this context the random attractor consists of a single point (see [32, 36, 38]). Hence the continuous dependence of the invariant measure follows from the continuous dependence of the random attractor. This can be found in [45] in an abstract setting and [32] in this setting. In the setting of this article, the random attractor is not necessarily a single point, hence results for the attractor do not translate to results for the invariant measure. Nonetheless we show that the long-term statistics of the equations with nearby parameters are near to each other. In particular, our results hold even when the viscosity is not large relative to the typical scale of the energy of the forcing.
In order to keep the notation at a bearable level, we introduce the parameter
space $\Lambda = \mathbf{R}_+ \times \mathcal{L}_2 \times \mathcal{H}$ and we denote its
elements by

\[
\alpha = (\nu, Q, f).
\]

We equip $\Lambda$ with the natural distance given by

\[
d(\alpha, \tilde\alpha)^2 = |\nu - \tilde\nu|^2 + \|Q - \tilde Q\|^2 + \|f - \tilde f\|^2.
\]

We denote by $\Lambda_0$ the subset of $\Lambda$ that satisfies Assumption 1. For
every $\alpha \in \Lambda_0$, we denote by $\mu_\star^\alpha$ the unique invariant
measure for (SNS) with parameters $\alpha$ and by $\mathcal{P}_t^\alpha$ the
corresponding semigroup. For $\alpha \in \Lambda$, $\mu_\star^\alpha$ will simply
denote any invariant probability measure, not necessarily unique, for
$(\mathcal{P}_t^\alpha)^*$. One then has the following regularity result:

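Theorem 5.13. For every $\alpha \in \Lambda_0$, there exist $\eta > 0$,
$\varepsilon > 0$ and $C_\alpha > 0$ such that

\[
d_\eta(\mu_\star^\alpha, \mu_\star^{\tilde\alpha}) \le C_\alpha\, d(\alpha, \tilde\alpha),
\]

for every $\tilde\alpha \in \Lambda$ with $d(\alpha, \tilde\alpha) < \varepsilon$.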

Remark 5.14. Going carefully through the proofs of the results in this
article and keeping track of the dependence of all a priori estimates on the
parameters, we believe that one can show that it is possible to choose for
$\eta$, $\varepsilon$ and $C_\alpha$ continuous functions of $\alpha$. The main
obstacle to this program is to recover the bounds of [39] under weaker
assumptions on $Q$.

Remark 5.15. Even though $\Lambda_0$ is dense in $\Lambda$, this result does not
allow one to conclude anything about the set of invariant measures for
$\alpha \notin \Lambda_0$. One would expect that there exist values of $\alpha$
such that (SNS) with parameters $\alpha$ has more than one invariant measure.
This would then necessarily imply that $C_\beta \gtrsim 1/d(\alpha, \beta)$ for
$\beta \in \Lambda_0$ close to $\alpha$.

Theorem 5.13 is the result of the following meta-theorem. Given two
Markov semigroups, if one is uniformly ergodic and the other is close to
the first on $O(1)$ time intervals, then any invariant measure of the second
is close to the unique invariant measure of the first. Theorem 1.4 gives the
needed ergodicity for $\alpha \in \Lambda_0$. The closeness of the time $t$
transition densities is given by Corollary 5.17 below. It follows from the
following bound on the difference between solutions to (SNS) with different
sets of parameters:

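Proposition 5.16. Let $w_0 \in \mathcal{H}$ and, for any two sets of parameters
$\alpha$ and $\tilde\alpha$, let us denote by $w_t$ the solution to (SNS) starting
at $w_0$ with parameters $\alpha$ and by $\tilde w_t$ the solution starting at
$w_0$ with parameters $\tilde\alpha$.

Then, for every $\alpha \in \Lambda$, there exist $\eta_0 > 0$ and
$\varepsilon > 0$ such that, for every $\eta \le \eta_0$ there exist $\gamma > 0$
and $C > 0$ so that

\[
\mathbf{E}\|w_t - \tilde w_t\|^2 \le C e^{\gamma t + \eta \|w_0\|^2}\, d(\alpha, \tilde\alpha)^2,
\]

for every $\tilde\alpha \in \Lambda$ with $d(\alpha, \tilde\alpha) < \varepsilon$.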

We now use this result to prove the needed estimate on the closeness of 
the time t dynamics. 

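Corollary 5.17. For any $\alpha \in \Lambda$ there exists an $\eta_0 > 0$ so that
for any $\eta \le \eta_0$ there exist $\gamma > 0$, $\varepsilon > 0$, $t_0 > 0$ and
$C > 0$ so that one has

\[
d_\eta\bigl( (\mathcal{P}_t^\alpha)^* \mu, (\mathcal{P}_t^{\tilde\alpha})^* \mu \bigr)
  \le C e^{\gamma t}\, d(\alpha, \tilde\alpha) \int_{\mathcal{H}} e^{\eta \|w\|^2}\, \mu(dw)
\]

for any measure $\mu$ on $\mathcal{H}$, $t \ge t_0$ and $\tilde\alpha \in \Lambda$ with
$d(\alpha, \tilde\alpha) < \varepsilon$.

For brevity in the sequel, we will simply write $\mathcal{P}_t^{\alpha*}$ for
$(\mathcal{P}_t^\alpha)^*$.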

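Proof of Corollary 5.17. First note that, for every pair $(w, \tilde w)$ in
$\mathcal{H}$ and for every $\eta > 0$, one has the upper bound

\[
d_\eta(w, \tilde w) \le \|w - \tilde w\| \bigl( e^{\eta \|w\|^2} + e^{\eta \|\tilde w\|^2} \bigr).
\tag{34}
\]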

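Fix now $\alpha$, let $\varepsilon$ be as given by Proposition 5.16, and choose an
arbitrary $\tilde\alpha \in \Lambda$ with $d(\alpha, \tilde\alpha) < \varepsilon$. Using
the notation of Proposition 5.16, we have for $\eta$ sufficiently small

\begin{align*}
d_\eta(\mathcal{P}_t^{\alpha*} \delta_{w_0}, \mathcal{P}_t^{\tilde\alpha*} \delta_{w_0})
  &\le \mathbf{E}\, d_\eta(w_t, \tilde w_t)
   \le \bigl( 2\, \mathbf{E}\|w_t - \tilde w_t\|^2\,
        \mathbf{E}(e^{2\eta \|w_t\|^2} + e^{2\eta \|\tilde w_t\|^2}) \bigr)^{1/2} \\
  &\le C\, d(\alpha, \tilde\alpha) \exp\Bigl( \gamma t
        + \bigl( \tfrac{\eta}{2} + \eta e^{-(\nu \wedge \tilde\nu) t/2} \bigr) \|w_0\|^2 \Bigr).
\end{align*}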

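This shows that there exist constants $t_0$, $\gamma$ and $C$ such that

\[
d_\eta(\mathcal{P}_t^{\alpha*} \mu, \mathcal{P}_t^{\tilde\alpha*} \mu)
  \le C\, d(\alpha, \tilde\alpha)\, e^{\gamma t} \int_{\mathcal{H}} e^{\eta \|w\|^2}\, \mu(dw),
\]

for every $t \ge t_0$. By Remark A.2 we can choose the constants uniform over
all $\tilde\alpha$ with $d(\alpha, \tilde\alpha) < \varepsilon$. □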
With Corollary 5.17 in hand, we return to the proof of Theorem 5.13.

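Proof of Theorem 5.13. We know from Theorem 5.5 that there exists $t_1$
such that

\[
d_\eta(\mathcal{P}_t^{\alpha*} \mu_1, \mathcal{P}_t^{\alpha*} \mu_2) \le \tfrac12\, d_\eta(\mu_1, \mu_2),
\]

for every $t \ge t_1$. Let $t_0$ be as in Corollary 5.17. Choosing
$t = \max\{t_0, t_1\}$, we have

\begin{align*}
d_\eta(\mu_\star^\alpha, \mu_\star^{\tilde\alpha})
  &\le d_\eta(\mathcal{P}_t^{\alpha*} \mu_\star^\alpha, \mathcal{P}_t^{\alpha*} \mu_\star^{\tilde\alpha})
     + d_\eta(\mathcal{P}_t^{\alpha*} \mu_\star^{\tilde\alpha}, \mathcal{P}_t^{\tilde\alpha*} \mu_\star^{\tilde\alpha}) \\
  &\le \tfrac12\, d_\eta(\mu_\star^\alpha, \mu_\star^{\tilde\alpha})
     + C\, d(\alpha, \tilde\alpha)\, e^{\gamma t} \int_{\mathcal{H}} e^{\eta \|w\|^2}\, \mu_\star^{\tilde\alpha}(dw).
\end{align*}

Notice that in (28) the constants on the right-hand side of the estimate
depend continuously on the parameters for $\alpha \in \Lambda$. Hence it follows
from (28) that, for $\eta$ sufficiently small, $\int_{\mathcal{H}} e^{\eta \|w\|^2}\,
\mu_\star^{\tilde\alpha}(dw)$ is bounded uniformly for all $\tilde\alpha$ with
$d(\alpha, \tilde\alpha) < \varepsilon$, so that the claim follows. □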
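
The absorption step that closes this proof can be spelled out as follows. This is only a sketch: $E$ is shorthand introduced here (it does not appear in the original) for the error factor supplied by Corollary 5.17, and the rearrangement assumes that $d_\eta(\mu_\star^\alpha, \mu_\star^{\tilde\alpha})$ is finite.

```latex
% E := C e^{\gamma t} \int_{\mathcal H} e^{\eta\|w\|^2}\,\mu_\star^{\tilde\alpha}(dw)
% (shorthand for this sketch; assumed finite and bounded uniformly in \tilde\alpha)
\begin{align*}
d_\eta(\mu_\star^\alpha,\mu_\star^{\tilde\alpha})
  &\le \tfrac12\, d_\eta(\mu_\star^\alpha,\mu_\star^{\tilde\alpha})
     + E\, d(\alpha,\tilde\alpha) \\
\Longrightarrow\quad
d_\eta(\mu_\star^\alpha,\mu_\star^{\tilde\alpha})
  &\le 2E\, d(\alpha,\tilde\alpha).
\end{align*}
```

Under these assumptions, one admissible choice for the constant in Theorem 5.13 would be $C_\alpha = 2 \sup E$, the supremum being taken over the parameters within distance $\varepsilon$ of $\alpha$.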

We close this section with the proof of Proposition 5.16, which amounts to 
the continuous dependence on the parameters in A of the solution operator 
of (SNS). 

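Proof of Proposition 5.16. Define $\rho_t = w_t - \tilde w_t$,
$\delta_\nu = \nu - \tilde\nu$, $\delta_f = f - \tilde f$ and $\delta_Q = Q - \tilde Q$.
One then has

\[
d\rho_t = \bigl( \nu \Delta \rho_t + \delta_\nu \Delta \tilde w_t
  + B(\mathcal{K} w_t, \rho_t) + B(\mathcal{K} \rho_t, \tilde w_t) + \delta_f \bigr)\, dt
  + \delta_Q\, dW.
\]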

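At this point, we introduce the stochastic convolution

\[
\Psi_t = \int_0^t e^{\nu \Delta (t-s)} \delta_Q\, dW(s),
\]

and we set $\bar\rho_t = \rho_t - \Psi_t$. This yields for $\bar\rho$

\begin{align*}
\tfrac12 \partial_t \|\bar\rho_t\|^2
  &= -\nu \|\bar\rho_t\|_1^2 - \delta_\nu \langle \nabla \bar\rho_t, \nabla \tilde w_t \rangle
     + \langle B(\mathcal{K} \bar\rho_t, \tilde w_t), \bar\rho_t \rangle \\
  &\quad + \langle B(\mathcal{K} w_t, \Psi_t), \bar\rho_t \rangle
     + \langle B(\mathcal{K} \Psi_t, \tilde w_t), \bar\rho_t \rangle
     + \langle \delta_f, \bar\rho_t \rangle.
\end{align*}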

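Fix now $\eta > 0$. Making use of (36), we see that there exists a universal
constant $C$ such that

\begin{align*}
\partial_t \|\bar\rho_t\|^2
  &\le -\nu \|\bar\rho_t\|_1^2 + \frac{\delta_\nu^2}{\nu} \|\tilde w_t\|_1^2
     + C \|\tilde w_t\|_1 \|\bar\rho_t\|_{1/2} \|\bar\rho_t\| \\
  &\quad + \frac{\eta\nu}{2} (\|w_t\|_1^2 + \|\tilde w_t\|_1^2) \|\bar\rho_t\|^2
     + \frac{C}{\eta\nu} \|\Psi_t\|_1^2 + 2 \langle \delta_f, \bar\rho_t \rangle.
\end{align*}

Note now that it follows from Hölder and Young's inequalities that there
exists a universal constant $C$ such that

\[
C \|\tilde w_t\|_1 \|\bar\rho_t\|_{1/2} \|\bar\rho_t\|
  \le \nu \|\bar\rho_t\|_1^2 + \frac{\eta\nu}{2} \|\tilde w_t\|_1^2 \|\bar\rho_t\|^2
     + \frac{C}{\eta^2 \nu^3} \|\bar\rho_t\|^2.
\]

Combining these bounds yields

\begin{align*}
\partial_t \|\bar\rho_t\|^2
  &\le \Bigl( 1 + \frac{C}{\eta^2 \nu^3}
       + \eta\nu (\|w_t\|_1^2 + \|\tilde w_t\|_1^2) \Bigr) \|\bar\rho_t\|^2 \\
  &\quad + \|\delta_f\|^2 + \frac{C}{\eta\nu} \|\Psi_t\|_1^2
       + \frac{\delta_\nu^2}{\nu} \|\tilde w_t\|_1^2.
\end{align*}

We can now apply Gronwall's inequality to get the bound

\begin{align*}
\|\bar\rho_t\|^2
  &\le \exp\Bigl( \Bigl( 1 + \frac{C}{\eta^2 \nu^3} \Bigr) t
       + \eta\nu \int_0^t (\|w_s\|_1^2 + \|\tilde w_s\|_1^2)\, ds \Bigr) \\
  &\quad \times \Bigl( \|\delta_f\|^2 t + \frac{C}{\eta\nu} \int_0^t \|\Psi_s\|_1^2\, ds
       + \frac{\delta_\nu^2}{\nu} \int_0^t \|\tilde w_s\|_1^2\, ds \Bigr).
\end{align*}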



Using the bound $x \le a^{-1} e^{ax}$, applying Cauchy-Schwarz and using the fact
that there exists a universal constant $C$ such that, for every Gaussian
random variable $X$ taking values in a separable Hilbert space, one has

\[
\mathbf{E}\|X\|^4 \le C (\mathbf{E}\|X\|^2)^2,
\]

we eventually get that there exist constants $C$ and $\gamma$ depending
continuously on $\eta$ and on the parameters $\alpha$ and $\tilde\alpha$ such that,
for every $\eta$ sufficiently small, one has the bound

\[
\mathbf{E}\|\bar\rho_t\|^2 \le C e^{\gamma t + \eta \|w_0\|^2}
  \Bigl( \delta_\nu^2 + \|\delta_f\|^2 + \int_0^t \mathbf{E}\|\Psi_s\|_1^2\, ds \Bigr).
\]

The claim then follows immediately from the fact that

\[
\mathbf{E}\|\Psi_t\|_1^2 \le \frac{1}{2\nu} \|\delta_Q\|^2,
\]

for every $t \ge 0$. □
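
A bound of this type on the stochastic convolution can be checked mode by mode. The following is a sketch under assumptions that are not spelled out in this section: $W$ is taken to be a cylindrical Wiener process, $\|\cdot\|_{HS}$ is the Hilbert-Schmidt norm, and $(e_k)$ denotes an orthonormal basis of eigenfunctions of $\Delta$ with $\Delta e_k = -|k|^2 e_k$.

```latex
\begin{align*}
\mathbf{E}\|\Psi_t\|_1^2
  &= \int_0^t \bigl\| (-\Delta)^{1/2} e^{\nu\Delta(t-s)} \delta_Q \bigr\|_{HS}^2 \, ds
     && \text{(It\^o isometry)} \\
  &= \sum_k \|\delta_Q^* e_k\|^2 \, |k|^2 \int_0^t e^{-2\nu |k|^2 (t-s)} \, ds \\
  &= \sum_k \|\delta_Q^* e_k\|^2 \, \frac{1 - e^{-2\nu |k|^2 t}}{2\nu}
   \le \frac{1}{2\nu} \|\delta_Q\|_{HS}^2 .
\end{align*}
```

The factor $|k|^2$ coming from the $\|\cdot\|_1$ norm is exactly cancelled by the time integral of the heat kernel, which is why the resulting bound is uniform in $t$.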



6. Discussion. We have proven a spectral gap in a Wasserstein distance 
for a class of Markov processes satisfying a gradient estimate and a weak 
(topological) irreducibility assumption. Measuring convergence in a Wasser- 
stein metric allows one to incorporate information about the pathwise con- 
traction properties of the system. When the system is completely pathwise 
contracting, the story is relatively straightforward; see [36, 38] or [25] for 
the finite-dimensional setting. However, when the system is not pathwise 
contracting one must introduce a change of measure to make it contracting. 
This was one of the central ideas used in [14, 37]. The term in the gradient
estimate which does not have a derivative reflects the probabilistic cost of 
this change of measure while the term with a gradient but a coefficient less 
than 1 reflects the contraction property obtained via the change of measure. 

When the gradient estimate is not uniform, the existence of a Lyapunov 
function is required. The convergence is then measured in a Wasserstein 
distance weighted by the Lyapunov function. In this "Harris-like" setting, 
the contraction properties of the system arise from two sources. Points close 
to the center of the phase space, as measured by the value of the Lyapunov 
function, contract due to the combination of deterministic contraction and 
probabilistic mixing captured by the gradient estimate. Points far out in the 
space move closer to each other in the distance weighted by the Lyapunov 
function simply because the linear instability of the flow is compensated by 
the decrease of the values of the Lyapunov function as the solution moves 
points toward the center of the phase space. 

While we have applied our general theory to the single example of the 
stochastic Navier-Stokes equations with degenerate forcing, we believe that 
these results will be useful in many contexts. The gradient estimate allows
one to capture the combination of mixing due to the presence of noise and due
to the contractive nature of the dynamics in one simple estimate. In the
context of degenerately forced dissipative SPDEs, control of the gradient term
on the right-hand side of Assumption 5 combines an argument strongly inspired
by the probabilistic proofs of Hörmander's theorem [24] based on Malliavin's
calculus [33, 41, 48], together with the infinitesimal equivalent of the
Foias-Prodi-type estimate, namely the fact that the linearized flow contracts
all but finitely many directions.

This work has its intellectual roots in many papers. In finite dimensions, 
spectral gaps in weighted total variation norms like (25) have been ob- 
tained for some time [40], but these estimates are of course not uniform 
when (SNS) is approximated by a sequence of finite-dimensional systems 
(say by spectral Galerkin approximations). In [46], spaces of observables 
weighted by Lyapunov functions are used to prove the existence of solu- 
tions to infinite-dimensional Kolmogorov equations. The convergence of ob- 
servables dominated by Lyapunov functions was also given in [27, 38] in 
the "essentially elliptic" case. The results obtained there were, however, far 
from what is needed to prove a spectral gap. The convergence results are 
direct descendants of those developed by many authors in, among others, 
[6, 14, 20, 28, 34, 37, 38, 43]. All of these works make use of a version of 
the Foias-Prodi-type estimate [18], introduced in the stochastic context in 
[35]. The later papers also use a coupling construction to prove convergence. 
In particular, [20, 37, 38] developed a coupling construction to prove expo- 
nential convergence. Though in a less explicit way than its predecessors, the 
present work makes use of both ideas. 
APPENDIX: A PRIORI BOUNDS ON THE DYNAMICS

This appendix is devoted to the proof of the technical estimates used 
throughout the last two sections of this article. The techniques used to de- 
rive these estimates are standard. Even though most of these bounds are 
probably known to the experts in this field, we have not always been able 
to find references that state them in the form required here. In particular, 
we need precise bounds on the difference between the solutions (and their 
Jacobians) for two nearby initial conditions. 

We define for $\alpha \in \mathbf{R}$ and for $w$ a smooth function on $[0, 2\pi]^2$
with mean $0$ the norm $\|w\|_\alpha$ by

\[
\|w\|_\alpha^2 = \sum_{k \in \mathbf{Z}^2 \setminus \{(0,0)\}} |k|^{2\alpha} |w_k|^2,
\]

where of course $w_k$ denotes the Fourier mode with wavenumber $k$. Define
furthermore $(\mathcal{K} w)_k = -i w_k k^\perp / |k|^2$. If $v$, $u_1$ and $u_2$ are
as $w$ and $u = (u_1, u_2)$, then $B(u, v) = -(u \cdot \nabla) v$. Setting
$S = \{ s = (s_1, s_2, s_3) \in \mathbf{R}_+^3 : \sum s_i = 1,\ s \ne (1,0,0),
(0,1,0), (0,0,1) \}$ and keeping $u$, $v$ and $w$ as above, the following
relations are useful (cf. [7]):

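\begin{align}
\langle B(u, v), w \rangle &= -\langle B(u, w), v \rangle
  && \text{if } \nabla \cdot u = 0, \tag{35} \\
|\langle B(u, v), w \rangle| &\le C \|u\|_{s_1} \|v\|_{1+s_2} \|w\|_{s_3},
  && (s_1, s_2, s_3) \in S, \tag{36} \\
\|B(u, v)\|_\alpha &\le C_\alpha \|u\| \|v\|
  && \text{if } \alpha < -2 \text{ and } \nabla \cdot u = 0, \tag{37} \\
\|\mathcal{K} v\|_\alpha &= \|v\|_{\alpha - 1}, \tag{38}
\end{align}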

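\[
\|v\|_\beta^2 \le \varepsilon \|v\|_\gamma^2
  + \varepsilon^{-2 \frac{\gamma - \beta}{\beta - \alpha}} \|v\|_\alpha^2
  \qquad \text{if } 0 \le \alpha < \beta < \gamma \text{ and } \varepsilon > 0.
\tag{39}
\]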
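
The Fourier-side definitions above lend themselves to a small numerical check of relation (38). The following is a minimal sketch, not code from the paper: the function names `sobolev_norm_sq`, `biot_savart` and `vector_norm_sq` are hypothetical, a function is represented by a finite dictionary of Fourier coefficients, and the convention $k^\perp = (k_2, -k_1)$ is an assumption.

```python
def sobolev_norm_sq(coeffs, alpha):
    """Squared norm ||w||_alpha^2 = sum over k != (0,0) of |k|^(2 alpha) |w_k|^2.

    `coeffs` maps wavenumbers (k1, k2) to complex Fourier coefficients w_k.
    """
    total = 0.0
    for (k1, k2), wk in coeffs.items():
        if (k1, k2) == (0, 0):
            continue  # mean-zero convention: the zero mode is excluded
        total += (k1 * k1 + k2 * k2) ** alpha * abs(wk) ** 2
    return total


def biot_savart(coeffs):
    """Modewise (K w)_k = -i w_k k_perp / |k|^2, with k_perp = (k2, -k1) (assumed)."""
    velocity = {}
    for (k1, k2), wk in coeffs.items():
        if (k1, k2) == (0, 0):
            continue
        k_sq = k1 * k1 + k2 * k2
        velocity[(k1, k2)] = (-1j * wk * k2 / k_sq, 1j * wk * k1 / k_sq)
    return velocity


def vector_norm_sq(vec_coeffs, alpha):
    """Same weighted norm for a two-component field, summing both components."""
    total = 0.0
    for (k1, k2), (u1, u2) in vec_coeffs.items():
        total += (k1 * k1 + k2 * k2) ** alpha * (abs(u1) ** 2 + abs(u2) ** 2)
    return total


if __name__ == "__main__":
    w = {(1, 0): 1 + 2j, (0, -3): 0.5, (2, 2): -1j}
    u = biot_savart(w)
    # Relation (38): ||K w||_alpha = ||w||_{alpha - 1}; both sides are printed.
    print(vector_norm_sq(u, 1.5), sobolev_norm_sq(w, 0.5))
```

With this convention each mode of the recovered velocity satisfies $k \cdot (\mathcal{K}w)_k = 0$, so the field is divergence-free, which is the hypothesis under which (35) and (37) apply.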

We start with the following set of a priori bounds, most of which were taken 
from [21] and [39]. 

Lemma A.1. The solution $w_t$ of the 2D stochastic Navier-Stokes equations
in the vorticity formulation satisfies the following bounds:

1. There exist constants $C, \eta_\star, \gamma > 0$, such that

\[
\mathbf{E} \exp\Bigl( \nu \int_s^t \eta \|w_r\|_1^2\, dr - \gamma (t - s) \Bigr)
  \le C \exp(\eta \|w_0\|^2),
\tag{40}
\]

for every $t \ge s \ge 0$ and for every $\eta \le \eta_\star$.

2. For every $N > 0$, every $t > 0$ and every $\eta > 0$, there exists a constant
$C$ such that one has

\[
\mathbf{E}\|w_t\|_1^N \le C \exp(\eta \|w_0\|^2),
\tag{41}
\]

for every initial condition $w_0 \in \mathcal{H}$.
3. There exist constants $\eta_\star > 0$ and $C > 0$ such that for every $t > 0$
and every $\eta \le \eta_\star$, the bound

\[
\mathbf{E} \exp(\eta \|w_t\|^2) \le C \exp(\eta e^{-\nu t/2} \|w_0\|^2)
\tag{42}
\]

holds.

4. For every $\eta > 0$, there exists a constant $C > 0$ such that the Jacobian
$J_{0,t}$ satisfies almost surely

\[
\|J_{0,t}\| \le \exp\Bigl( \eta \int_0^t \|w_s\|_1^2\, ds + Ct \Bigr),
\tag{43}
\]

for every $t \ge 0$.

5. For every $\eta > 0$ and every $T > 0$, there exists a constant $C$ such that

\[
\int_0^t \|J_{0,s}\xi\|_1^2\, ds \le C \|\xi\|^2 \exp\Bigl( \eta \int_0^t \|w_s\|_1^2\, ds \Bigr),
\tag{44}
\]

for every $\xi \in \mathcal{H}$ and every $t \in [0, T]$.

6. For every $\eta > 0$ there exists a constant $C$ such that

\[
\|J_{0,t}\xi\|_1^2 \le C \|\xi\|^2 \exp\Bigl( \eta \int_0^t \|w_s\|_1^2\, ds + Ct \Bigr),
\tag{45}
\]

almost surely, for every $t > 0$ and for every $\xi \in \mathcal{H}$.

7. For every $\eta > 0$ and every $p > 0$, there exists $C > 0$ such that the
Hessian $K_{0,t}$ satisfies

\[
\mathbf{E}\|K_{0,t}\|^p \le C \exp(\eta \|w_0\|^2),
\tag{46}
\]

for every $t \in [0, 1]$.

Remark A.2. It is straightforward to verify that if one fixes a $K_1 > 0$
and $K_2 > 0$, the constants $C$, $\eta_\star$ and $\gamma$ from the statements in
Lemma A.1 can be chosen uniformly over all $\nu \ge K_1$ and
$\|Q\|, \|f\| \le K_2$.

Proof of Lemma A.l. Points 1, 4 and 7 are taken from Lemma 4.10 
in [21]. Point 2 follows from Lemma A. 4 in [39] and point 6 follows from 
Lemma B.l in [39]. Point 3 follows immediately from (28). 

It remains to show Point 5. It follows from the linearization of the
Navier-Stokes equations that

\[
\|J_{0,t}\xi\|^2 - \|\xi\|^2 = -2\nu \int_0^t \|J_{0,s}\xi\|_1^2\, ds
  + 2 \int_0^t \langle J_{0,s}\xi, B(\mathcal{K} J_{0,s}\xi, w_s) \rangle\, ds.
\]

Using (36), this in turn implies that

\begin{align*}
\int_0^t \|J_{0,s}\xi\|_1^2\, ds
  &\le \frac{\|\xi\|^2}{2\nu}
     + \frac{C}{\nu} \int_0^t \|w_s\|_1 \|J_{0,s}\xi\| \|J_{0,s}\xi\|_1\, ds \\
  &\le \frac{\|\xi\|^2}{2\nu}
     + \frac{C}{\nu^2} \int_0^t \|w_s\|_1^2 \|J_{0,s}\xi\|^2\, ds
     + \frac12 \int_0^t \|J_{0,s}\xi\|_1^2\, ds.
\end{align*}

It thus follows from (43) that

\[
\int_0^t \|J_{0,s}\xi\|_1^2\, ds \le \frac{\|\xi\|^2}{\nu}
  + C \|\xi\|^2 \exp\Bigl( \eta \int_0^t \|w_s\|_1^2\, ds + Ct \Bigr)
    \int_0^t \|w_s\|_1^2\, ds,
\]

and the result follows immediately. □
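
The final step of the argument for Point 5 can be made explicit as follows. This is a sketch only: $X_t$ is shorthand introduced here (not in the original) for $\int_0^t \|w_s\|_1^2\, ds$, and the constants $C$ may change from line to line and are allowed to depend on $\nu$, $\eta$ and $T$.

```latex
% X_t := \int_0^t \|w_s\|_1^2 \, ds   (shorthand introduced for this sketch)
\begin{align*}
\int_0^t \|J_{0,s}\xi\|_1^2\, ds
  &\le \frac{\|\xi\|^2}{\nu} + C \|\xi\|^2 e^{\eta X_t + Ct} X_t \\
  &\le \frac{\|\xi\|^2}{\nu} + \frac{C e^{CT}}{\eta} \|\xi\|^2 e^{2\eta X_t}
     && \text{using } X_t \le \eta^{-1} e^{\eta X_t},\ t \le T \\
  &\le C \|\xi\|^2 e^{2\eta X_t}
     && \text{since } e^{2\eta X_t} \ge 1 .
\end{align*}
% A bound of the form (44) then follows after renaming 2\eta as \eta.
```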

In the remainder of this section, we use the following notation, which is
the same as in the proof of Theorem 5.10. We fix an element $w \in \mathcal{H}$ and
a natural number $n > 0$. We denote by $\Pi_n$ the orthogonal projection in
$\mathcal{H}$ onto the Fourier modes with $|k| \le n$ and we write
$\tilde w = \Pi_n w$. We denote by $\Phi_t$ the random flow solving (27) and set
$w_t = \Phi_t(w)$, $\tilde w_t = \Phi_t(\tilde w)$, $\rho_t = w_t - \tilde w_t$. We
also use the notation

\[
J_t = (D\Phi_t)(w), \qquad \tilde J_t = (D\Phi_t)(\tilde w), \qquad
J_{\rho,t} = J_t - \tilde J_t.
\]

The aim of this section is to show that, given $t > 0$ and provided $n$ is
large enough, it is possible to make $\rho_t$ and $J_{\rho,t}$ arbitrarily small.
More precisely, the main result of this section is:

Theorem A.3. For every $\gamma > 0$, every $T > 0$ and every $\eta > 0$ there
exists $n > 0$ such that

\[
\mathbf{E}\|\rho_T\|^2 \le \gamma \exp(\eta \|w\|^2), \qquad
\mathbf{E}\|J_{\rho,T}\|^2 \le \gamma \exp(\eta \|w\|^2),
\]

for every $w \in \mathcal{H}$.

We define the family of increasing stochastic processes $F_\eta^p(t)$ by

\[
F_\eta^p(t) = \exp\Bigl( p\eta \int_0^t (\|w_s\|_1^2 + \|\tilde w_s\|_1^2)\, ds \Bigr)
  \Bigl( 1 + \sup_{s \in [0,t]} (\|w_s\| + \|\tilde w_s\|) \Bigr)^p.
\]

Note that one has the following result, the proof of which is a trivial
application of the a priori bounds from Lemma A.1:

Lemma A.4. For every $\eta > 0$, every $t > 0$ and every $p > 0$ there exist
$\eta_0 > 0$ and $C$ such that

\[
\mathbf{E}(F_\zeta^p(t)) \le C \exp(\eta \|w\|^2),
\]

uniformly for every $n > 0$, every $w \in \mathcal{H}$ and every
$\zeta \in [0, \eta_0]$.

Proof of Theorem A.3. We fix a terminal time $T > 0$ and start with the
bound for $\|\rho_T\|$, which is almost identical to the proof of [21],
Lemma 4.17. Note first that $\rho$ solves the equation

\[
\partial_t \rho_t = \nu \Delta \rho_t + \tfrac12 \tilde B(\rho_t, w_t + \tilde w_t),
\]




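where we set $\tilde B(w, \tilde w) = B(\mathcal{K} w, \tilde w) + B(\mathcal{K} \tilde w, w)$.
Define $\rho_t^\ell = \Pi_n \rho_t$ and $\rho_t^h = \rho_t - \rho_t^\ell$, so that

\begin{align*}
\partial_t \|\rho_t^\ell\|^2 &= -2\nu \|\rho_t^\ell\|_1^2
  + \langle B(\mathcal{K} \rho_t^\ell, w_t + \tilde w_t), \rho_t^\ell \rangle \\
  &\quad - \langle B(\mathcal{K} \rho_t^h, \rho_t^\ell), w_t + \tilde w_t \rangle
  - \langle B(\mathcal{K} w_t + \mathcal{K} \tilde w_t, \rho_t^\ell), \rho_t^h \rangle, \\
\partial_t \|\rho_t^h\|^2 &= -2\nu \|\rho_t^h\|_1^2
  - \langle B(\mathcal{K} \rho_t, \rho_t^h), w_t + \tilde w_t \rangle
  - \langle B(\mathcal{K} w_t + \mathcal{K} \tilde w_t, \rho_t^h), \rho_t \rangle.
\end{align*}

The initial conditions for these equations are given by

\[
\rho_0^\ell = 0, \qquad \rho_0^h = (1 - \Pi_n) w.
\]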

The equations satisfied by $\rho^\ell$ and $\rho^h$ are the same as the ones
appearing in the proof of [21], Lemma 4.17, so that we get the bounds

II^H 2 <lkl| 2 (e-^ 2t + ^(i)), 

llPtll 2 < C v exp^??^ \\w r + w r \\l^\\w s + w s \\l /2 \\ps\\ 2 ds 

<C v F*{t) J*\\w 8 + w a \\i\\(%\\d8. 
These bounds are valid for every $\eta > 0$. It follows from the first bound that

[ T \\rtfd8<-F*(T), 
Jo n 1 

so that the second bound yields 

(47) sup || Pi || 2 <-^F 2 yT). 

te[o,T] V n 

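The bound on $\mathbf{E}\|\rho_T\|^2$ then follows from Lemma A.4.
In order to bound $J_{\rho,T}$, note first that $J_{\rho,0} = 0$ and

\[
\partial_t J_{\rho,t} = \nu \Delta J_{\rho,t}
  + \tfrac12 \tilde B(J_{\rho,t}, w_t + \tilde w_t)
  + \tfrac12 \tilde B(J_t + \tilde J_t, \rho_t).
\]

Fix now a tangent vector $\xi \in \mathcal{H}$. It follows from (36) that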

5 t ||j p , t eii 2 <-2^i|j pit £ii 2 + ci|j p , t £ii 1/4 ||j p , t eiiii^+^i| 1 
+cii^eiiiiiPtiiii^+^iii/4 

< (c v + v \\w t + w t \\ 2 )\\J P M 2 + WptfWM + M\\ 2 1/4 - 

This bound is valid (with different values for the constant $C_\eta$) for any
value of $\eta > 0$. It immediately implies that

II J p ,t£|| 2 < F ^ T )[ WpttPtt + Uh/iPti + fell dt 
<CFi ri (T)U\\J o \\p t \\\\M + MW1/2 dt
< cf! v (t)U\\ 3/2 £ \\ P t II II M + M\\l /2 dt, 

where we made use of (43). It follows that there exists a constant C such 
that, for every a > 0, one has the bound 

II J P H\\ 2 < Q [ Wptf dt + aCF U T )) ll^ll 2 + a jf (II J ^\i + II-&II?) dt - 

It follows from (44) that 

II J p ,t II 2 < \\pt II 2 dt + aCff, (T)) , 

so that the claim follows by combining Lemma A. 4 with the bound (47). □ 